Server Admin Log

From Wikitech

2022-11-29

  • 20:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P41792 and previous config saved to /var/cache/conftool/dbconfig/20221129-201739-ladsgroup.json
  • 20:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P41791 and previous config saved to /var/cache/conftool/dbconfig/20221129-201533-marostegui.json
  • 20:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T323907)', diff saved to https://phabricator.wikimedia.org/P41790 and previous config saved to /var/cache/conftool/dbconfig/20221129-200233-ladsgroup.json
  • 20:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P41789 and previous config saved to /var/cache/conftool/dbconfig/20221129-200027-marostegui.json
  • 19:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130 (T321126)', diff saved to https://phabricator.wikimedia.org/P41788 and previous config saved to /var/cache/conftool/dbconfig/20221129-194520-marostegui.json
  • 19:42 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2130 (T321126)', diff saved to https://phabricator.wikimedia.org/P41787 and previous config saved to /var/cache/conftool/dbconfig/20221129-194257-marostegui.json
  • 19:42 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2130.codfw.wmnet with reason: Maintenance
  • 19:42 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2130.codfw.wmnet with reason: Maintenance
  • 19:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116 (T321126)', diff saved to https://phabricator.wikimedia.org/P41786 and previous config saved to /var/cache/conftool/dbconfig/20221129-194235-marostegui.json
  • 19:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P41785 and previous config saved to /var/cache/conftool/dbconfig/20221129-192728-marostegui.json
  • 19:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2182 (T323907)', diff saved to https://phabricator.wikimedia.org/P41784 and previous config saved to /var/cache/conftool/dbconfig/20221129-192628-ladsgroup.json
  • 19:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2182.codfw.wmnet with reason: Maintenance
  • 19:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2182.codfw.wmnet with reason: Maintenance
  • 19:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317 (T323907)', diff saved to https://phabricator.wikimedia.org/P41783 and previous config saved to /var/cache/conftool/dbconfig/20221129-192606-ladsgroup.json
  • 19:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P41782 and previous config saved to /var/cache/conftool/dbconfig/20221129-191220-marostegui.json
  • 19:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317', diff saved to https://phabricator.wikimedia.org/P41781 and previous config saved to /var/cache/conftool/dbconfig/20221129-191100-ladsgroup.json
  • 18:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116 (T321126)', diff saved to https://phabricator.wikimedia.org/P41780 and previous config saved to /var/cache/conftool/dbconfig/20221129-185714-marostegui.json
  • 18:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317', diff saved to https://phabricator.wikimedia.org/P41779 and previous config saved to /var/cache/conftool/dbconfig/20221129-185553-ladsgroup.json
  • 18:56 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2116 (T321126)', diff saved to https://phabricator.wikimedia.org/P41778 and previous config saved to /var/cache/conftool/dbconfig/20221129-185450-marostegui.json
  • 18:56 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2116.codfw.wmnet with reason: Maintenance
  • 18:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2116.codfw.wmnet with reason: Maintenance
  • 18:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103 (T321126)', diff saved to https://phabricator.wikimedia.org/P41777 and previous config saved to /var/cache/conftool/dbconfig/20221129-185429-marostegui.json
  • 18:43 sukhe: sukhe@cumin2002:~$ sudo ipmitool -I lanplus -H "cp5021.mgmt.eqsin.wmnet" -U root -E chassis power cycle
  • 18:42 sukhe@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp5021.eqsin.wmnet with OS buster
  • 18:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317 (T323907)', diff saved to https://phabricator.wikimedia.org/P41776 and previous config saved to /var/cache/conftool/dbconfig/20221129-184047-ladsgroup.json
  • 18:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103', diff saved to https://phabricator.wikimedia.org/P41775 and previous config saved to /var/cache/conftool/dbconfig/20221129-183922-marostegui.json
  • 18:36 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 18:36 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 18:36 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
  • 18:32 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
  • 18:28 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp5021.eqsin.wmnet with OS buster
  • 18:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103', diff saved to https://phabricator.wikimedia.org/P41774 and previous config saved to /var/cache/conftool/dbconfig/20221129-182416-marostegui.json
  • 18:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103 (T321126)', diff saved to https://phabricator.wikimedia.org/P41773 and previous config saved to /var/cache/conftool/dbconfig/20221129-180909-marostegui.json
  • 18:06 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2103 (T321126)', diff saved to https://phabricator.wikimedia.org/P41772 and previous config saved to /var/cache/conftool/dbconfig/20221129-180646-marostegui.json
  • 18:06 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2103.codfw.wmnet with reason: Maintenance
  • 18:06 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2103.codfw.wmnet with reason: Maintenance
  • 18:05 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2102.codfw.wmnet with reason: Maintenance
  • 18:05 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2102.codfw.wmnet with reason: Maintenance
  • 18:05 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2097.codfw.wmnet with reason: Maintenance
  • 18:05 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2097.codfw.wmnet with reason: Maintenance
  • 18:05 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 18:04 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 18:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T321126)', diff saved to https://phabricator.wikimedia.org/P41771 and previous config saved to /var/cache/conftool/dbconfig/20221129-180451-marostegui.json
  • 18:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2169:3317 (T323907)', diff saved to https://phabricator.wikimedia.org/P41770 and previous config saved to /var/cache/conftool/dbconfig/20221129-180408-ladsgroup.json
  • 18:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2169.codfw.wmnet with reason: Maintenance
  • 18:03 robh@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['dns5004']
  • 18:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2169.codfw.wmnet with reason: Maintenance
  • 18:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317 (T323907)', diff saved to https://phabricator.wikimedia.org/P41769 and previous config saved to /var/cache/conftool/dbconfig/20221129-180347-ladsgroup.json
  • 18:02 robh@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['ganeti5004']
  • 17:54 robh@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cp5025']
  • 17:52 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['dns5004']
  • 17:52 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1205.eqiad.wmnet with OS bullseye
  • 17:51 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti5004']
  • 17:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P41768 and previous config saved to /var/cache/conftool/dbconfig/20221129-174945-marostegui.json
  • 17:49 robh@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cp5027']
  • 17:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317', diff saved to https://phabricator.wikimedia.org/P41767 and previous config saved to /var/cache/conftool/dbconfig/20221129-174840-ladsgroup.json
  • 17:48 robh@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cp5026']
  • 17:47 robh@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cp5024']
  • 17:45 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1204.eqiad.wmnet with OS bullseye
  • 17:42 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp5025']
  • 17:41 robh@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cp5025']
  • 17:37 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp5027']
  • 17:36 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp5026']
  • 17:36 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1205.eqiad.wmnet with reason: host reimage
  • 17:36 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp5025']
  • 17:35 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp5024']
  • 17:35 robh@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cp5023']
  • 17:34 robh@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti5004.mgmt.eqsin.wmnet with reboot policy FORCED
  • 17:34 robh@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cp5022']
  • 17:34 robh@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cp5021']
  • 17:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P41766 and previous config saved to /var/cache/conftool/dbconfig/20221129-173438-marostegui.json
  • 17:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317', diff saved to https://phabricator.wikimedia.org/P41765 and previous config saved to /var/cache/conftool/dbconfig/20221129-173334-ladsgroup.json
  • 17:33 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1205.eqiad.wmnet with reason: host reimage
  • 17:33 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1204.eqiad.wmnet with reason: host reimage
  • 17:33 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:31 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for db1206 - pt1979@cumin2002"
  • 17:30 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for db1206 - pt1979@cumin2002"
  • 17:28 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 17:26 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1204.eqiad.wmnet with reason: host reimage
  • 17:23 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp5023']
  • 17:22 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp5022']
  • 17:22 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp5021']
  • 17:22 robh@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dns5004.mgmt.eqsin.wmnet with reboot policy FORCED
  • 17:21 robh@cumin2002: START - Cookbook sre.hosts.provision for host ganeti5004.mgmt.eqsin.wmnet with reboot policy FORCED
  • 17:21 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host db1205.eqiad.wmnet with OS bullseye
  • 17:20 robh@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti5004.mgmt.eqsin.wmnet with reboot policy FORCED
  • 17:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T321126)', diff saved to https://phabricator.wikimedia.org/P41764 and previous config saved to /var/cache/conftool/dbconfig/20221129-171931-marostegui.json
  • 17:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317 (T323907)', diff saved to https://phabricator.wikimedia.org/P41763 and previous config saved to /var/cache/conftool/dbconfig/20221129-171827-ladsgroup.json
  • 17:18 otto@deploy1002: Finished deploy [analytics/refinery@c45b61d] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@c45b61d] - an-test-coord1001 only (duration: 00m 04s)
  • 17:18 otto@deploy1002: Started deploy [analytics/refinery@c45b61d] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@c45b61d] - an-test-coord1001 only
  • 17:17 otto@deploy1002: Finished deploy [analytics/refinery@c45b61d] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@c45b61d] (duration: 01m 03s)
  • 17:17 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1196 (T321126)', diff saved to https://phabricator.wikimedia.org/P41762 and previous config saved to /var/cache/conftool/dbconfig/20221129-171710-marostegui.json
  • 17:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1196.eqiad.wmnet with reason: Maintenance
  • 17:16 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1196.eqiad.wmnet with reason: Maintenance
  • 17:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T321126)', diff saved to https://phabricator.wikimedia.org/P41761 and previous config saved to /var/cache/conftool/dbconfig/20221129-171638-marostegui.json
  • 17:16 otto@deploy1002: Started deploy [analytics/refinery@c45b61d] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@c45b61d]
  • 17:16 otto@deploy1002: Finished deploy [analytics/refinery@c45b61d] (thin): Regular analytics weekly train THIN [analytics/refinery@c45b61d] (duration: 00m 09s)
  • 17:15 otto@deploy1002: Started deploy [analytics/refinery@c45b61d] (thin): Regular analytics weekly train THIN [analytics/refinery@c45b61d]
  • 17:15 otto@deploy1002: Finished deploy [analytics/refinery@c45b61d]: Regular analytics weekly train [analytics/refinery@c45b61d] (duration: 03m 54s)
  • 17:15 robh@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cp5027.mgmt.eqsin.wmnet with reboot policy FORCED
  • 17:14 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host db1204.eqiad.wmnet with OS bullseye
  • 17:13 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/api-gateway: sync
  • 17:13 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/api-gateway: sync
  • 17:12 robh@cumin2002: START - Cookbook sre.hosts.provision for host dns5004.mgmt.eqsin.wmnet with reboot policy FORCED
  • 17:11 otto@deploy1002: Started deploy [analytics/refinery@c45b61d]: Regular analytics weekly train [analytics/refinery@c45b61d]
  • 17:11 robh@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host lvs5004.mgmt.eqsin.wmnet with reboot policy FORCED
  • 17:04 robh@cumin2002: START - Cookbook sre.hosts.provision for host ganeti5004.mgmt.eqsin.wmnet with reboot policy FORCED
  • 17:04 robh@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cp5026.mgmt.eqsin.wmnet with reboot policy FORCED
  • 17:04 robh@cumin2002: START - Cookbook sre.hosts.provision for host lvs5004.mgmt.eqsin.wmnet with reboot policy FORCED
  • 17:03 robh@cumin2002: START - Cookbook sre.hosts.provision for host cp5027.mgmt.eqsin.wmnet with reboot policy FORCED
  • 17:03 robh@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cp5025.mgmt.eqsin.wmnet with reboot policy FORCED
  • 17:02 robh@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cp5024.mgmt.eqsin.wmnet with reboot policy FORCED
  • 17:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P41760 and previous config saved to /var/cache/conftool/dbconfig/20221129-170131-marostegui.json
  • 16:53 robh@cumin2002: START - Cookbook sre.hosts.provision for host cp5026.mgmt.eqsin.wmnet with reboot policy FORCED
  • 16:52 robh@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cp5023.mgmt.eqsin.wmnet with reboot policy FORCED
  • 16:51 robh@cumin2002: START - Cookbook sre.hosts.provision for host cp5025.mgmt.eqsin.wmnet with reboot policy FORCED
  • 16:51 robh@cumin2002: START - Cookbook sre.hosts.provision for host cp5024.mgmt.eqsin.wmnet with reboot policy FORCED
  • 16:50 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 16:50 robh@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cp5021.mgmt.eqsin.wmnet with reboot policy FORCED
  • 16:50 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 16:49 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
  • 16:49 robh@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cp5022.mgmt.eqsin.wmnet with reboot policy FORCED
  • 16:49 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
  • 16:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P41759 and previous config saved to /var/cache/conftool/dbconfig/20221129-164624-marostegui.json
  • 16:41 robh@cumin2002: START - Cookbook sre.hosts.provision for host cp5023.mgmt.eqsin.wmnet with reboot policy FORCED
  • 16:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2168:3317 (T323907)', diff saved to https://phabricator.wikimedia.org/P41758 and previous config saved to /var/cache/conftool/dbconfig/20221129-163942-ladsgroup.json
  • 16:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2168.codfw.wmnet with reason: Maintenance
  • 16:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2168.codfw.wmnet with reason: Maintenance
  • 16:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T323907)', diff saved to https://phabricator.wikimedia.org/P41757 and previous config saved to /var/cache/conftool/dbconfig/20221129-163921-ladsgroup.json
  • 16:38 robh@cumin2002: START - Cookbook sre.hosts.provision for host cp5022.mgmt.eqsin.wmnet with reboot policy FORCED
  • 16:38 robh@cumin2002: START - Cookbook sre.hosts.provision for host cp5021.mgmt.eqsin.wmnet with reboot policy FORCED
  • 16:37 robh@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts ['cp5021']
  • 16:37 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp5021']
  • 16:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T321126)', diff saved to https://phabricator.wikimedia.org/P41756 and previous config saved to /var/cache/conftool/dbconfig/20221129-163118-marostegui.json
  • 16:28 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1186 (T321126)', diff saved to https://phabricator.wikimedia.org/P41755 and previous config saved to /var/cache/conftool/dbconfig/20221129-162857-marostegui.json
  • 16:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1186.eqiad.wmnet with reason: Maintenance
  • 16:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1186.eqiad.wmnet with reason: Maintenance
  • 16:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T321126)', diff saved to https://phabricator.wikimedia.org/P41754 and previous config saved to /var/cache/conftool/dbconfig/20221129-162835-marostegui.json
  • 16:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P41753 and previous config saved to /var/cache/conftool/dbconfig/20221129-162414-ladsgroup.json
  • 16:23 sukhe@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for 16 hosts
  • 16:23 sukhe@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for 16 hosts
  • 16:21 robh@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dns5004
  • 16:20 robh@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dns5004
  • 16:20 robh@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti5004
  • 16:19 robh@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti5004
  • 16:19 robh@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host lvs5004
  • 16:19 robh@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host lvs5004
  • 16:19 robh@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5027
  • 16:18 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 16:18 robh@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5027
  • 16:18 robh@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5026
  • 16:18 robh@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5026
  • 16:18 robh@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5025
  • 16:18 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 16:18 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
  • 16:18 robh@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5025
  • 16:18 robh@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5024
  • 16:18 robh@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5024
  • 16:18 robh@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5023
  • 16:18 robh@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5023
  • 16:18 robh@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5022
  • 16:17 robh@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5022
  • 16:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1123 (re)pooling @ 100%: Maint done', diff saved to https://phabricator.wikimedia.org/P41752 and previous config saved to /var/cache/conftool/dbconfig/20221129-161604-ladsgroup.json
  • 16:17 robh@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5021
  • 16:17 robh@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5021
  • 16:16 robh@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:14 robh@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: eqsin hosts - robh@cumin2002"
  • 16:14 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
  • 16:14 oblivian@deploy1002: Synchronized wmf-config/reverse-proxy.php: test deployment (duration: 04m 28s)
  • 16:13 robh@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: eqsin hosts - robh@cumin2002"
  • 16:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P41751 and previous config saved to /var/cache/conftool/dbconfig/20221129-161329-marostegui.json
  • 16:12 oblivian@cumin1001: conftool action : set/pooled=yes; selector: dc=eqiad,name=mw14(89|9).*
  • 16:11 robh@cumin2002: START - Cookbook sre.dns.netbox
  • 16:09 oblivian@deploy1002: Synchronized wmf-config/reverse-proxy.php: test deployment (duration: 04m 35s)
  • 16:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P41750 and previous config saved to /var/cache/conftool/dbconfig/20221129-160907-ladsgroup.json
  • 16:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 16:07 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 16:06 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
  • 16:04 oblivian@deploy1002: Synchronized wmf-config/reverse-proxy.php: test deployment (duration: 04m 36s)
  • 16:03 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
  • 16:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1123 (re)pooling @ 75%: Maint done', diff saved to https://phabricator.wikimedia.org/P41749 and previous config saved to /var/cache/conftool/dbconfig/20221129-160059-ladsgroup.json
  • 15:58 oblivian@cumin1001: conftool action : set/pooled=no; selector: dc=eqiad,name=mw14(89|9).*
  • 15:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P41748 and previous config saved to /var/cache/conftool/dbconfig/20221129-155822-marostegui.json
  • 15:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T323907)', diff saved to https://phabricator.wikimedia.org/P41747 and previous config saved to /var/cache/conftool/dbconfig/20221129-155401-ladsgroup.json
  • 15:47 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['db1204']
  • 15:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1123 (re)pooling @ 25%: Maint done', diff saved to https://phabricator.wikimedia.org/P41746 and previous config saved to /var/cache/conftool/dbconfig/20221129-154554-ladsgroup.json
  • 15:45 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db1204']
  • 15:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T321126)', diff saved to https://phabricator.wikimedia.org/P41745 and previous config saved to /var/cache/conftool/dbconfig/20221129-154316-marostegui.json
  • 15:42 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['db1204']
  • 15:40 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1184 (T321126)', diff saved to https://phabricator.wikimedia.org/P41744 and previous config saved to /var/cache/conftool/dbconfig/20221129-154055-marostegui.json
  • 15:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1184.eqiad.wmnet with reason: Maintenance
  • 15:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1184.eqiad.wmnet with reason: Maintenance
  • 15:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T321126)', diff saved to https://phabricator.wikimedia.org/P41743 and previous config saved to /var/cache/conftool/dbconfig/20221129-154033-marostegui.json
  • 15:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1123 (re)pooling @ 10%: Maint done', diff saved to https://phabricator.wikimedia.org/P41742 and previous config saved to /var/cache/conftool/dbconfig/20221129-153049-ladsgroup.json
  • 15:25 Emperor: set thanos ring replicas to 3.0 T311690
  • 15:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P41741 and previous config saved to /var/cache/conftool/dbconfig/20221129-152526-marostegui.json
  • 15:20 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['db1205']
  • 15:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2159 (T323907)', diff saved to https://phabricator.wikimedia.org/P41740 and previous config saved to /var/cache/conftool/dbconfig/20221129-151647-ladsgroup.json
  • 15:16 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2095.codfw.wmnet with reason: Maintenance
  • 15:16 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2095.codfw.wmnet with reason: Maintenance
  • 15:16 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2159.codfw.wmnet with reason: Maintenance
  • 15:16 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2159.codfw.wmnet with reason: Maintenance
  • 15:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T323907)', diff saved to https://phabricator.wikimedia.org/P41739 and previous config saved to /var/cache/conftool/dbconfig/20221129-151609-ladsgroup.json
  • 15:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P41737 and previous config saved to /var/cache/conftool/dbconfig/20221129-151020-marostegui.json
  • 15:07 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on an-worker1089.eqiad.wmnet with reason: replacing RAID controller battery
  • 15:06 btullis@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on an-worker1089.eqiad.wmnet with reason: replacing RAID controller battery
  • 15:03 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db1205']
  • 15:03 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db1204']
  • 15:02 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 15:01 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 15:01 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
  • 15:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P41735 and previous config saved to /var/cache/conftool/dbconfig/20221129-150103-ladsgroup.json
  • 15:00 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
  • 15:00 hnowlan: removing /srv/cassandra on all maps hosts
  • 15:00 oblivian@cumin1001: conftool action : set/pooled=inactive; selector: dc=eqiad,name=mw14(89|9).*
  • 14:58 oblivian@deploy1002: Synchronized wmf-config/reverse-proxy.php: test deployment (duration: 04m 13s)
  • 14:55 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 14:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T321126)', diff saved to https://phabricator.wikimedia.org/P41734 and previous config saved to /var/cache/conftool/dbconfig/20221129-145513-marostegui.json
  • 14:54 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 6 hosts with reason: replacing RAID controller battery
  • 14:54 btullis@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on 6 hosts with reason: replacing RAID controller battery
  • 14:51 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 14:51 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
  • 14:51 taavi@deploy1002: Finished scap: testing a scap sync (duration: 05m 17s)
  • 14:49 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1169 (T321126)', diff saved to https://phabricator.wikimedia.org/P41732 and previous config saved to /var/cache/conftool/dbconfig/20221129-144952-marostegui.json
  • 14:49 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 14:49 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
  • 14:49 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 14:49 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1140.eqiad.wmnet with reason: Maintenance
  • 14:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1123.eqiad.wmnet with reason: Maintenance
  • 14:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1123.eqiad.wmnet with reason: Maintenance
  • 14:49 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1140.eqiad.wmnet with reason: Maintenance
  • 14:48 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1139.eqiad.wmnet with reason: Maintenance
  • 14:48 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1139.eqiad.wmnet with reason: Maintenance
  • 14:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T321126)', diff saved to https://phabricator.wikimedia.org/P41731 and previous config saved to /var/cache/conftool/dbconfig/20221129-144831-marostegui.json
  • 14:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P41730 and previous config saved to /var/cache/conftool/dbconfig/20221129-144556-ladsgroup.json
  • 14:45 taavi@deploy1002: Started scap: testing a scap sync
  • 14:43 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1205.mgmt.eqiad.wmnet with reboot policy FORCED
  • 14:43 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1204.mgmt.eqiad.wmnet with reboot policy FORCED
  • 14:37 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1123.eqiad.wmnet with reason: Maintenance
  • 14:37 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1123.eqiad.wmnet with reason: Maintenance
  • 14:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1123.eqiad.wmnet with reason: Maintenance
  • 14:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1123.eqiad.wmnet with reason: Maintenance
  • 14:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P41729 and previous config saved to /var/cache/conftool/dbconfig/20221129-143324-marostegui.json
  • 14:33 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1123.eqiad.wmnet with reason: Maintenance
  • 14:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1123.eqiad.wmnet with reason: Maintenance
  • 14:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T323907)', diff saved to https://phabricator.wikimedia.org/P41728 and previous config saved to /var/cache/conftool/dbconfig/20221129-143049-ladsgroup.json
  • 14:29 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 14:28 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 14:28 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
  • 14:27 taavi@deploy1002: Finished scap: re-syncing the backport to see if the errors fix themself (duration: 04m 58s)
  • 14:25 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
  • 14:22 taavi@deploy1002: Started scap: re-syncing the backport to see if the errors fix themself
  • 14:22 taavi@deploy1002: Finished scap: Backport for reverse-proxy: Add eqiad e/f[1-4] subnets (T324018) (duration: 07m 33s)
  • 14:20 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 14:19 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 14:19 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
  • 14:18 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
  • 14:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P41726 and previous config saved to /var/cache/conftool/dbconfig/20221129-141818-marostegui.json
  • 14:16 taavi@deploy1002: taavi and taavi: Backport for reverse-proxy: Add eqiad e/f[1-4] subnets (T324018) synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet
  • 14:15 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1123.eqiad.wmnet with reason: Maintenance
  • 14:15 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1123.eqiad.wmnet with reason: Maintenance
  • 14:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1123.eqiad.wmnet with reason: Maintenance
  • 14:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1123.eqiad.wmnet with reason: Maintenance
  • 14:14 taavi@deploy1002: Started scap: Backport for reverse-proxy: Add eqiad e/f[1-4] subnets (T324018)
  • 14:12 mbsantos@deploy1002: helmfile [eqiad] DONE helmfile.d/services/wikifeeds: apply
  • 14:11 mbsantos@deploy1002: helmfile [eqiad] START helmfile.d/services/wikifeeds: apply
  • 14:10 mbsantos@deploy1002: helmfile [codfw] DONE helmfile.d/services/wikifeeds: apply
  • 14:09 mbsantos@deploy1002: helmfile [codfw] START helmfile.d/services/wikifeeds: apply
  • 14:08 mbsantos@deploy1002: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
  • 14:08 mbsantos@deploy1002: helmfile [staging] START helmfile.d/services/wikifeeds: apply
  • 14:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T321126)', diff saved to https://phabricator.wikimedia.org/P41725 and previous config saved to /var/cache/conftool/dbconfig/20221129-140311-marostegui.json
  • 14:00 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1135 (T321126)', diff saved to https://phabricator.wikimedia.org/P41724 and previous config saved to /var/cache/conftool/dbconfig/20221129-140050-marostegui.json
  • 14:00 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1135.eqiad.wmnet with reason: Maintenance
  • 14:00 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1135.eqiad.wmnet with reason: Maintenance
  • 14:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 (T321126)', diff saved to https://phabricator.wikimedia.org/P41723 and previous config saved to /var/cache/conftool/dbconfig/20221129-140018-marostegui.json
  • 13:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2150 (T323907)', diff saved to https://phabricator.wikimedia.org/P41722 and previous config saved to /var/cache/conftool/dbconfig/20221129-135549-ladsgroup.json
  • 13:55 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2150.codfw.wmnet with reason: Maintenance
  • 13:55 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2150.codfw.wmnet with reason: Maintenance
  • 13:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T323907)', diff saved to https://phabricator.wikimedia.org/P41721 and previous config saved to /var/cache/conftool/dbconfig/20221129-135526-ladsgroup.json
  • 13:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P41720 and previous config saved to /var/cache/conftool/dbconfig/20221129-134511-marostegui.json
  • 13:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P41719 and previous config saved to /var/cache/conftool/dbconfig/20221129-134019-ladsgroup.json
  • 13:34 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host db1205.mgmt.eqiad.wmnet with reboot policy FORCED
  • 13:33 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host db1204.mgmt.eqiad.wmnet with reboot policy FORCED
  • 13:32 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:32 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for db120[4-5] - pt1979@cumin2002"
  • 13:30 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for db120[4-5] - pt1979@cumin2002"
  • 13:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P41718 and previous config saved to /var/cache/conftool/dbconfig/20221129-133005-marostegui.json
  • 13:28 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 13:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P41717 and previous config saved to /var/cache/conftool/dbconfig/20221129-132513-ladsgroup.json
  • 13:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 (T321126)', diff saved to https://phabricator.wikimedia.org/P41715 and previous config saved to /var/cache/conftool/dbconfig/20221129-131459-marostegui.json
  • 13:12 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1134 (T321126)', diff saved to https://phabricator.wikimedia.org/P41714 and previous config saved to /var/cache/conftool/dbconfig/20221129-131238-marostegui.json
  • 13:12 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1134.eqiad.wmnet with reason: Maintenance
  • 13:12 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1134.eqiad.wmnet with reason: Maintenance
  • 13:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1132 (T321126)', diff saved to https://phabricator.wikimedia.org/P41713 and previous config saved to /var/cache/conftool/dbconfig/20221129-131216-marostegui.json
  • 13:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T323907)', diff saved to https://phabricator.wikimedia.org/P41712 and previous config saved to /var/cache/conftool/dbconfig/20221129-131006-ladsgroup.json
  • 13:00 moritzm: installing glibc security updates on buster
  • 12:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1132', diff saved to https://phabricator.wikimedia.org/P41711 and previous config saved to /var/cache/conftool/dbconfig/20221129-125710-marostegui.json
  • 12:51 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 12:51 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 12:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T323907)', diff saved to https://phabricator.wikimedia.org/P41710 and previous config saved to /var/cache/conftool/dbconfig/20221129-125121-ladsgroup.json
  • 12:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1132', diff saved to https://phabricator.wikimedia.org/P41709 and previous config saved to /var/cache/conftool/dbconfig/20221129-124203-marostegui.json
  • 12:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P41708 and previous config saved to /var/cache/conftool/dbconfig/20221129-123614-ladsgroup.json
  • 12:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2122 (T323907)', diff saved to https://phabricator.wikimedia.org/P41707 and previous config saved to /var/cache/conftool/dbconfig/20221129-123134-ladsgroup.json
  • 12:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2122.codfw.wmnet with reason: Maintenance
  • 12:31 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2122.codfw.wmnet with reason: Maintenance
  • 12:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121 (T323907)', diff saved to https://phabricator.wikimedia.org/P41706 and previous config saved to /var/cache/conftool/dbconfig/20221129-123113-ladsgroup.json
  • 12:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1132 (T321126)', diff saved to https://phabricator.wikimedia.org/P41705 and previous config saved to /var/cache/conftool/dbconfig/20221129-122657-marostegui.json
  • 12:24 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1132 (T321126)', diff saved to https://phabricator.wikimedia.org/P41704 and previous config saved to /var/cache/conftool/dbconfig/20221129-122436-marostegui.json
  • 12:24 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1132.eqiad.wmnet with reason: Maintenance
  • 12:24 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1132.eqiad.wmnet with reason: Maintenance
  • 12:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1128 (T321126)', diff saved to https://phabricator.wikimedia.org/P41703 and previous config saved to /var/cache/conftool/dbconfig/20221129-122414-marostegui.json
  • 12:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P41702 and previous config saved to /var/cache/conftool/dbconfig/20221129-122108-ladsgroup.json
  • 12:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P41701 and previous config saved to /var/cache/conftool/dbconfig/20221129-121606-ladsgroup.json
  • 12:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T322618)', diff saved to https://phabricator.wikimedia.org/P41700 and previous config saved to /var/cache/conftool/dbconfig/20221129-121354-ladsgroup.json
  • 12:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1128', diff saved to https://phabricator.wikimedia.org/P41699 and previous config saved to /var/cache/conftool/dbconfig/20221129-120907-marostegui.json
  • 12:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T323907)', diff saved to https://phabricator.wikimedia.org/P41698 and previous config saved to /var/cache/conftool/dbconfig/20221129-120601-ladsgroup.json
  • 12:05 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/thumbor: sync
  • 12:04 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/thumbor: sync
  • 12:03 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/thumbor: sync
  • 12:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P41697 and previous config saved to /var/cache/conftool/dbconfig/20221129-120100-ladsgroup.json
  • 11:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P41696 and previous config saved to /var/cache/conftool/dbconfig/20221129-115847-ladsgroup.json
  • 11:54 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host grafana2001.codfw.wmnet
  • 11:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1128', diff saved to https://phabricator.wikimedia.org/P41695 and previous config saved to /var/cache/conftool/dbconfig/20221129-115401-marostegui.json
  • 11:53 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/thumbor: sync
  • 11:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetdb2003.codfw.wmnet
  • 11:47 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host grafana2001.codfw.wmnet
  • 11:47 marostegui: Drop scholarships database from m2 T243037
  • 11:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetdb2003.codfw.wmnet
  • 11:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121 (T323907)', diff saved to https://phabricator.wikimedia.org/P41694 and previous config saved to /var/cache/conftool/dbconfig/20221129-114553-ladsgroup.json
  • 11:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P41693 and previous config saved to /var/cache/conftool/dbconfig/20221129-114341-ladsgroup.json
  • 11:43 godog: +100G to global/prometheus in eqiad
  • 11:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1128 (T321126)', diff saved to https://phabricator.wikimedia.org/P41692 and previous config saved to /var/cache/conftool/dbconfig/20221129-113854-marostegui.json
  • 11:37 moritzm: uploaded ferm 2.5.1-1.1+wmf11u1 to apt.wikimedia.org/bookworm (rebasing our systemd startup fixes to what's in bookworm) T321783
  • 11:37 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/thumbor: sync
  • 11:37 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/thumbor: sync
  • 11:36 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1128 (T321126)', diff saved to https://phabricator.wikimedia.org/P41691 and previous config saved to /var/cache/conftool/dbconfig/20221129-113633-marostegui.json
  • 11:36 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1128.eqiad.wmnet with reason: Maintenance
  • 11:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1128.eqiad.wmnet with reason: Maintenance
  • 11:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 (T321126)', diff saved to https://phabricator.wikimedia.org/P41690 and previous config saved to /var/cache/conftool/dbconfig/20221129-113612-marostegui.json
  • 11:34 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/thumbor: sync
  • 11:34 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/thumbor: sync
  • 11:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T322618)', diff saved to https://phabricator.wikimedia.org/P41689 and previous config saved to /var/cache/conftool/dbconfig/20221129-112835-ladsgroup.json
  • 11:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P41688 and previous config saved to /var/cache/conftool/dbconfig/20221129-112106-marostegui.json
  • 11:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2177 (T322618)', diff saved to https://phabricator.wikimedia.org/P41687 and previous config saved to /var/cache/conftool/dbconfig/20221129-112053-ladsgroup.json
  • 11:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2177.codfw.wmnet with reason: Maintenance
  • 11:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2177.codfw.wmnet with reason: Maintenance
  • 11:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T322618)', diff saved to https://phabricator.wikimedia.org/P41686 and previous config saved to /var/cache/conftool/dbconfig/20221129-112043-ladsgroup.json
  • 11:10 oblivian@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 42 hosts
  • 11:10 oblivian@cumin1001: START - Cookbook sre.hosts.remove-downtime for 42 hosts
  • 11:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2121 (T323907)', diff saved to https://phabricator.wikimedia.org/P41685 and previous config saved to /var/cache/conftool/dbconfig/20221129-110926-ladsgroup.json
  • 11:09 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 11:09 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 11:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120 (T323907)', diff saved to https://phabricator.wikimedia.org/P41684 and previous config saved to /var/cache/conftool/dbconfig/20221129-110905-ladsgroup.json
  • 11:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P41683 and previous config saved to /var/cache/conftool/dbconfig/20221129-110559-marostegui.json
  • 11:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1202 (T323907)', diff saved to https://phabricator.wikimedia.org/P41682 and previous config saved to /var/cache/conftool/dbconfig/20221129-110546-ladsgroup.json
  • 11:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P41681 and previous config saved to /var/cache/conftool/dbconfig/20221129-110537-ladsgroup.json
  • 11:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1202.eqiad.wmnet with reason: Maintenance
  • 11:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1202.eqiad.wmnet with reason: Maintenance
  • 11:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T323907)', diff saved to https://phabricator.wikimedia.org/P41680 and previous config saved to /var/cache/conftool/dbconfig/20221129-110518-ladsgroup.json
  • 10:58 oblivian@puppetmaster1001: conftool action : set/weight=10; selector: cluster=(jobrunner|videoscaler),dc=eqiad,name=mw14[5-9].*
  • 10:55 _joe_: new appservers are in rotation T313327
  • 10:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120', diff saved to https://phabricator.wikimedia.org/P41678 and previous config saved to /var/cache/conftool/dbconfig/20221129-105358-ladsgroup.json
  • 10:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 (T321126)', diff saved to https://phabricator.wikimedia.org/P41677 and previous config saved to /var/cache/conftool/dbconfig/20221129-105050-marostegui.json
  • 10:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P41676 and previous config saved to /var/cache/conftool/dbconfig/20221129-105030-ladsgroup.json
  • 10:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P41675 and previous config saved to /var/cache/conftool/dbconfig/20221129-105011-ladsgroup.json
  • 10:49 oblivian@puppetmaster1001: conftool action : set/weight=30; selector: cluster=api_appserver,dc=eqiad,name=mw14[6-9].*
  • 10:48 oblivian@puppetmaster1001: conftool action : set/weight=30; selector: cluster=appserver,dc=eqiad,name=mw14[7-9].*
  • 10:48 hnowlan: stopping puppet on maps* for cassandra removal
  • 10:48 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1119 (T321126)', diff saved to https://phabricator.wikimedia.org/P41674 and previous config saved to /var/cache/conftool/dbconfig/20221129-104828-marostegui.json
  • 10:48 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1119.eqiad.wmnet with reason: Maintenance
  • 10:48 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1119.eqiad.wmnet with reason: Maintenance
  • 10:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1118 (T321126)', diff saved to https://phabricator.wikimedia.org/P41673 and previous config saved to /var/cache/conftool/dbconfig/20221129-104807-marostegui.json
  • 10:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120', diff saved to https://phabricator.wikimedia.org/P41672 and previous config saved to /var/cache/conftool/dbconfig/20221129-103852-ladsgroup.json
  • 10:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T322618)', diff saved to https://phabricator.wikimedia.org/P41671 and previous config saved to /var/cache/conftool/dbconfig/20221129-103524-ladsgroup.json
  • 10:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P41670 and previous config saved to /var/cache/conftool/dbconfig/20221129-103505-ladsgroup.json
  • 10:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1118', diff saved to https://phabricator.wikimedia.org/P41669 and previous config saved to /var/cache/conftool/dbconfig/20221129-103301-marostegui.json
  • 10:30 jynus: revoke temporary grants to scholarships for backups on db1117, db2160 T243037
  • 10:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2156 (T322618)', diff saved to https://phabricator.wikimedia.org/P41668 and previous config saved to /var/cache/conftool/dbconfig/20221129-102746-ladsgroup.json
  • 10:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2094.codfw.wmnet with reason: Maintenance
  • 10:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2094.codfw.wmnet with reason: Maintenance
  • 10:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2156.codfw.wmnet with reason: Maintenance
  • 10:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2156.codfw.wmnet with reason: Maintenance
  • 10:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T322618)', diff saved to https://phabricator.wikimedia.org/P41667 and previous config saved to /var/cache/conftool/dbconfig/20221129-102731-ladsgroup.json
  • 10:26 elukey: restart kube-apiserver on ml-serve-ctrl* to clear out some knative controller issue
  • 10:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120 (T323907)', diff saved to https://phabricator.wikimedia.org/P41666 and previous config saved to /var/cache/conftool/dbconfig/20221129-102345-ladsgroup.json
  • 10:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T323907)', diff saved to https://phabricator.wikimedia.org/P41665 and previous config saved to /var/cache/conftool/dbconfig/20221129-101958-ladsgroup.json
  • 10:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1118', diff saved to https://phabricator.wikimedia.org/P41664 and previous config saved to /var/cache/conftool/dbconfig/20221129-101754-marostegui.json
  • 10:15 marostegui@cumin1001: dbctl commit (dc=all): 'db2145 (re)pooling @ 100%: After HW maintenance', diff saved to https://phabricator.wikimedia.org/P41663 and previous config saved to /var/cache/conftool/dbconfig/20221129-101554-root.json
  • 10:15 moritzm: upgrading puppetdb2003 to bookworm T321783
  • 10:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P41662 and previous config saved to /var/cache/conftool/dbconfig/20221129-101225-ladsgroup.json
  • 10:07 jynus: add temporary grants to scholarships for backups on db1117, db2160 T243037
  • 10:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1194 (T323907)', diff saved to https://phabricator.wikimedia.org/P41661 and previous config saved to /var/cache/conftool/dbconfig/20221129-100319-ladsgroup.json
  • 10:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1194.eqiad.wmnet with reason: Maintenance
  • 10:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1194.eqiad.wmnet with reason: Maintenance
  • 10:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T323907)', diff saved to https://phabricator.wikimedia.org/P41660 and previous config saved to /var/cache/conftool/dbconfig/20221129-100258-ladsgroup.json
  • 10:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1118 (T321126)', diff saved to https://phabricator.wikimedia.org/P41659 and previous config saved to /var/cache/conftool/dbconfig/20221129-100248-marostegui.json
  • 10:00 marostegui@cumin1001: dbctl commit (dc=all): 'db2145 (re)pooling @ 75%: After HW maintenance', diff saved to https://phabricator.wikimedia.org/P41658 and previous config saved to /var/cache/conftool/dbconfig/20221129-100049-root.json
  • 10:00 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1118 (T321126)', diff saved to https://phabricator.wikimedia.org/P41657 and previous config saved to /var/cache/conftool/dbconfig/20221129-100025-marostegui.json
  • 10:00 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1118.eqiad.wmnet with reason: Maintenance
  • 09:59 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1118.eqiad.wmnet with reason: Maintenance
  • 09:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1107 (T321126)', diff saved to https://phabricator.wikimedia.org/P41656 and previous config saved to /var/cache/conftool/dbconfig/20221129-095931-marostegui.json
  • 09:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P41655 and previous config saved to /var/cache/conftool/dbconfig/20221129-095718-ladsgroup.json
  • 09:56 moritzm: installing curl security updates
  • 09:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2120 (T323907)', diff saved to https://phabricator.wikimedia.org/P41654 and previous config saved to /var/cache/conftool/dbconfig/20221129-094818-ladsgroup.json
  • 09:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2120.codfw.wmnet with reason: Maintenance
  • 09:48 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2120.codfw.wmnet with reason: Maintenance
  • 09:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108 (T323907)', diff saved to https://phabricator.wikimedia.org/P41653 and previous config saved to /var/cache/conftool/dbconfig/20221129-094757-ladsgroup.json
  • 09:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P41652 and previous config saved to /var/cache/conftool/dbconfig/20221129-094745-ladsgroup.json
  • 09:45 marostegui@cumin1001: dbctl commit (dc=all): 'db2145 (re)pooling @ 50%: After HW maintenance', diff saved to https://phabricator.wikimedia.org/P41651 and previous config saved to /var/cache/conftool/dbconfig/20221129-094544-root.json
  • 09:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1107', diff saved to https://phabricator.wikimedia.org/P41650 and previous config saved to /var/cache/conftool/dbconfig/20221129-094424-marostegui.json
  • 09:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T322618)', diff saved to https://phabricator.wikimedia.org/P41649 and previous config saved to /var/cache/conftool/dbconfig/20221129-094212-ladsgroup.json
  • 09:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2149 (T322618)', diff saved to https://phabricator.wikimedia.org/P41648 and previous config saved to /var/cache/conftool/dbconfig/20221129-093420-ladsgroup.json
  • 09:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2149.codfw.wmnet with reason: Maintenance
  • 09:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2149.codfw.wmnet with reason: Maintenance
  • 09:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108', diff saved to https://phabricator.wikimedia.org/P41647 and previous config saved to /var/cache/conftool/dbconfig/20221129-093250-ladsgroup.json
  • 09:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P41646 and previous config saved to /var/cache/conftool/dbconfig/20221129-093239-ladsgroup.json
  • 09:30 marostegui@cumin1001: dbctl commit (dc=all): 'db2145 (re)pooling @ 25%: After HW maintenance', diff saved to https://phabricator.wikimedia.org/P41645 and previous config saved to /var/cache/conftool/dbconfig/20221129-093039-root.json
  • 09:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1107', diff saved to https://phabricator.wikimedia.org/P41644 and previous config saved to /var/cache/conftool/dbconfig/20221129-092918-marostegui.json
  • 09:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 09:28 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 09:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109 (T322618)', diff saved to https://phabricator.wikimedia.org/P41643 and previous config saved to /var/cache/conftool/dbconfig/20221129-092822-ladsgroup.json
  • 09:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108', diff saved to https://phabricator.wikimedia.org/P41642 and previous config saved to /var/cache/conftool/dbconfig/20221129-091744-ladsgroup.json
  • 09:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T323907)', diff saved to https://phabricator.wikimedia.org/P41641 and previous config saved to /var/cache/conftool/dbconfig/20221129-091732-ladsgroup.json
  • 09:17 moritzm: update component/puppetdb7 to puppetdb 7.11.2-3 (fixing Postgres 15 compat) T321783
  • 09:15 marostegui@cumin1001: dbctl commit (dc=all): 'db2145 (re)pooling @ 10%: After HW maintenance', diff saved to https://phabricator.wikimedia.org/P41640 and previous config saved to /var/cache/conftool/dbconfig/20221129-091534-root.json
  • 09:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1107 (T321126)', diff saved to https://phabricator.wikimedia.org/P41639 and previous config saved to /var/cache/conftool/dbconfig/20221129-091412-marostegui.json
  • 09:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P41638 and previous config saved to /var/cache/conftool/dbconfig/20221129-091315-ladsgroup.json
  • 09:12 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2145', diff saved to https://phabricator.wikimedia.org/P41637 and previous config saved to /var/cache/conftool/dbconfig/20221129-091224-marostegui.json
  • 09:11 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1107 (T321126)', diff saved to https://phabricator.wikimedia.org/P41636 and previous config saved to /var/cache/conftool/dbconfig/20221129-091149-marostegui.json
  • 09:11 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1107.eqiad.wmnet with reason: Maintenance
  • 09:11 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1107.eqiad.wmnet with reason: Maintenance
  • 09:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 (T321126)', diff saved to https://phabricator.wikimedia.org/P41635 and previous config saved to /var/cache/conftool/dbconfig/20221129-091117-marostegui.json
  • 09:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108 (T323907)', diff saved to https://phabricator.wikimedia.org/P41634 and previous config saved to /var/cache/conftool/dbconfig/20221129-090237-ladsgroup.json
  • 09:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1191 (T323907)', diff saved to https://phabricator.wikimedia.org/P41633 and previous config saved to /var/cache/conftool/dbconfig/20221129-090044-ladsgroup.json
  • 09:00 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1191.eqiad.wmnet with reason: Maintenance
  • 09:00 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1191.eqiad.wmnet with reason: Maintenance
  • 09:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T323907)', diff saved to https://phabricator.wikimedia.org/P41632 and previous config saved to /var/cache/conftool/dbconfig/20221129-090023-ladsgroup.json
  • 08:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P41631 and previous config saved to /var/cache/conftool/dbconfig/20221129-085809-ladsgroup.json
  • 08:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P41630 and previous config saved to /var/cache/conftool/dbconfig/20221129-085611-marostegui.json
  • 08:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P41629 and previous config saved to /var/cache/conftool/dbconfig/20221129-084517-ladsgroup.json
  • 08:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109 (T322618)', diff saved to https://phabricator.wikimedia.org/P41628 and previous config saved to /var/cache/conftool/dbconfig/20221129-084302-ladsgroup.json
  • 08:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P41627 and previous config saved to /var/cache/conftool/dbconfig/20221129-084104-marostegui.json
  • 08:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2109 (T322618)', diff saved to https://phabricator.wikimedia.org/P41626 and previous config saved to /var/cache/conftool/dbconfig/20221129-083521-ladsgroup.json
  • 08:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2109.codfw.wmnet with reason: Maintenance
  • 08:35 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2109.codfw.wmnet with reason: Maintenance
  • 08:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105 (T322618)', diff saved to https://phabricator.wikimedia.org/P41625 and previous config saved to /var/cache/conftool/dbconfig/20221129-083511-ladsgroup.json
  • 08:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P41624 and previous config saved to /var/cache/conftool/dbconfig/20221129-083010-ladsgroup.json
  • 08:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2108 (T323907)', diff saved to https://phabricator.wikimedia.org/P41623 and previous config saved to /var/cache/conftool/dbconfig/20221129-082740-ladsgroup.json
  • 08:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2108.codfw.wmnet with reason: Maintenance
  • 08:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2108.codfw.wmnet with reason: Maintenance
  • 08:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 (T321126)', diff saved to https://phabricator.wikimedia.org/P41622 and previous config saved to /var/cache/conftool/dbconfig/20221129-082558-marostegui.json
  • 08:24 oblivian@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host mw1457.eqiad.wmnet
  • 08:23 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1106 (T321126)', diff saved to https://phabricator.wikimedia.org/P41621 and previous config saved to /var/cache/conftool/dbconfig/20221129-082335-marostegui.json
  • 08:23 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 08:23 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 08:23 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1106.eqiad.wmnet with reason: Maintenance
  • 08:23 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1106.eqiad.wmnet with reason: Maintenance
  • 08:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T321126)', diff saved to https://phabricator.wikimedia.org/P41620 and previous config saved to /var/cache/conftool/dbconfig/20221129-082307-marostegui.json
  • 08:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105', diff saved to https://phabricator.wikimedia.org/P41619 and previous config saved to /var/cache/conftool/dbconfig/20221129-082004-ladsgroup.json
  • 08:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T323907)', diff saved to https://phabricator.wikimedia.org/P41618 and previous config saved to /var/cache/conftool/dbconfig/20221129-081504-ladsgroup.json
  • 08:13 oblivian@cumin1001: START - Cookbook sre.hosts.reboot-single for host mw1457.eqiad.wmnet
  • 08:13 moritzm: rebalance Ganeti group D/codfw following reboots
  • 08:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P41614 and previous config saved to /var/cache/conftool/dbconfig/20221129-080801-marostegui.json
  • 08:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105', diff saved to https://phabricator.wikimedia.org/P41613 and previous config saved to /var/cache/conftool/dbconfig/20221129-080458-ladsgroup.json
  • 08:03 oblivian@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on 42 hosts with reason: Appservers
  • 08:00 oblivian@cumin1001: START - Cookbook sre.hosts.downtime for 3:00:00 on 42 hosts with reason: Appservers
  • 07:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1181 (T323907)', diff saved to https://phabricator.wikimedia.org/P41612 and previous config saved to /var/cache/conftool/dbconfig/20221129-075937-ladsgroup.json
  • 07:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 07:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 07:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T323907)', diff saved to https://phabricator.wikimedia.org/P41611 and previous config saved to /var/cache/conftool/dbconfig/20221129-075854-ladsgroup.json
  • 07:55 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2100.codfw.wmnet with reason: Maintenance
  • 07:55 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2100.codfw.wmnet with reason: Maintenance
  • 07:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P41610 and previous config saved to /var/cache/conftool/dbconfig/20221129-075254-marostegui.json
  • 07:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105 (T322618)', diff saved to https://phabricator.wikimedia.org/P41609 and previous config saved to /var/cache/conftool/dbconfig/20221129-074951-ladsgroup.json
  • 07:44 marostegui@cumin1001: dbctl commit (dc=all): 'db2174 (re)pooling @ 100%: After HW maintenance', diff saved to https://phabricator.wikimedia.org/P41608 and previous config saved to /var/cache/conftool/dbconfig/20221129-074441-root.json
  • 07:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P41607 and previous config saved to /var/cache/conftool/dbconfig/20221129-074347-ladsgroup.json
  • 07:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2105 (T322618)', diff saved to https://phabricator.wikimedia.org/P41606 and previous config saved to /var/cache/conftool/dbconfig/20221129-074229-ladsgroup.json
  • 07:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
  • 07:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
  • 07:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T321126)', diff saved to https://phabricator.wikimedia.org/P41605 and previous config saved to /var/cache/conftool/dbconfig/20221129-073748-marostegui.json
  • 07:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T323907)', diff saved to https://phabricator.wikimedia.org/P41604 and previous config saved to /var/cache/conftool/dbconfig/20221129-073706-ladsgroup.json
  • 07:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3311 (T321126)', diff saved to https://phabricator.wikimedia.org/P41603 and previous config saved to /var/cache/conftool/dbconfig/20221129-073525-marostegui.json
  • 07:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1105.eqiad.wmnet with reason: Maintenance
  • 07:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1105.eqiad.wmnet with reason: Maintenance
  • 07:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 (T321126)', diff saved to https://phabricator.wikimedia.org/P41602 and previous config saved to /var/cache/conftool/dbconfig/20221129-073504-marostegui.json
  • 07:29 marostegui@cumin1001: dbctl commit (dc=all): 'db2174 (re)pooling @ 75%: After HW maintenance', diff saved to https://phabricator.wikimedia.org/P41601 and previous config saved to /var/cache/conftool/dbconfig/20221129-072936-root.json
  • 07:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P41600 and previous config saved to /var/cache/conftool/dbconfig/20221129-072841-ladsgroup.json
  • 07:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1123.eqiad.wmnet with reason: Maintenance
  • 07:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1123.eqiad.wmnet with reason: Maintenance
  • 07:23 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 07:23 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 07:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P41599 and previous config saved to /var/cache/conftool/dbconfig/20221129-072159-ladsgroup.json
  • 07:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P41598 and previous config saved to /var/cache/conftool/dbconfig/20221129-071958-marostegui.json
  • 07:16 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1123.eqiad.wmnet with reason: Maintenance
  • 07:16 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1123.eqiad.wmnet with reason: Maintenance
  • 07:14 marostegui@cumin1001: dbctl commit (dc=all): 'db2174 (re)pooling @ 50%: After HW maintenance', diff saved to https://phabricator.wikimedia.org/P41597 and previous config saved to /var/cache/conftool/dbconfig/20221129-071431-root.json
  • 07:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1123.eqiad.wmnet with reason: Maintenance
  • 07:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1123.eqiad.wmnet with reason: Maintenance
  • 07:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T323907)', diff saved to https://phabricator.wikimedia.org/P41596 and previous config saved to /var/cache/conftool/dbconfig/20221129-071334-ladsgroup.json
  • 07:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1123.eqiad.wmnet with reason: Maintenance
  • 07:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1123.eqiad.wmnet with reason: Maintenance
  • 07:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P41595 and previous config saved to /var/cache/conftool/dbconfig/20221129-070653-ladsgroup.json
  • 07:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depool db1123 T323546', diff saved to https://phabricator.wikimedia.org/P41594 and previous config saved to /var/cache/conftool/dbconfig/20221129-070637-ladsgroup.json
  • 07:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P41593 and previous config saved to /var/cache/conftool/dbconfig/20221129-070451-marostegui.json
  • 07:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Promote db1157 to s3 primary and set section read-write T323546', diff saved to https://phabricator.wikimedia.org/P41592 and previous config saved to /var/cache/conftool/dbconfig/20221129-070102-ladsgroup.json
  • 07:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Set s3 eqiad as read-only for maintenance - T323546', diff saved to https://phabricator.wikimedia.org/P41591 and previous config saved to /var/cache/conftool/dbconfig/20221129-070032-ladsgroup.json
  • 07:00 Amir1: Starting s3 eqiad failover from db1123 to db1157 - T323546
  • 06:59 marostegui@cumin1001: dbctl commit (dc=all): 'db2174 (re)pooling @ 25%: After HW maintenance', diff saved to https://phabricator.wikimedia.org/P41590 and previous config saved to /var/cache/conftool/dbconfig/20221129-065926-root.json
  • 06:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1174 (T323907)', diff saved to https://phabricator.wikimedia.org/P41589 and previous config saved to /var/cache/conftool/dbconfig/20221129-065741-ladsgroup.json
  • 06:57 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 06:57 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 06:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T323907)', diff saved to https://phabricator.wikimedia.org/P41588 and previous config saved to /var/cache/conftool/dbconfig/20221129-065147-ladsgroup.json
  • 06:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 (T321126)', diff saved to https://phabricator.wikimedia.org/P41587 and previous config saved to /var/cache/conftool/dbconfig/20221129-064945-marostegui.json
  • 06:47 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3311 (T321126)', diff saved to https://phabricator.wikimedia.org/P41586 and previous config saved to /var/cache/conftool/dbconfig/20221129-064721-marostegui.json
  • 06:47 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1099.eqiad.wmnet with reason: Maintenance
  • 06:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1099.eqiad.wmnet with reason: Maintenance
  • 06:45 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1138.eqiad.wmnet with reason: Maintenance
  • 06:44 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1138.eqiad.wmnet with reason: Maintenance
  • 06:44 marostegui@cumin1001: dbctl commit (dc=all): 'db2174 (re)pooling @ 10%: After HW maintenance', diff saved to https://phabricator.wikimedia.org/P41585 and previous config saved to /var/cache/conftool/dbconfig/20221129-064421-root.json
  • 06:44 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2140.codfw.wmnet with reason: Maintenance
  • 06:43 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2140.codfw.wmnet with reason: Maintenance
  • 06:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 06:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 06:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T323907)', diff saved to https://phabricator.wikimedia.org/P41584 and previous config saved to /var/cache/conftool/dbconfig/20221129-062549-ladsgroup.json
  • 06:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2177 (T323907)', diff saved to https://phabricator.wikimedia.org/P41583 and previous config saved to /var/cache/conftool/dbconfig/20221129-062533-ladsgroup.json
  • 06:25 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
  • 06:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
  • 06:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T323907)', diff saved to https://phabricator.wikimedia.org/P41582 and previous config saved to /var/cache/conftool/dbconfig/20221129-062523-ladsgroup.json
  • 06:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P41581 and previous config saved to /var/cache/conftool/dbconfig/20221129-061043-ladsgroup.json
  • 06:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P41580 and previous config saved to /var/cache/conftool/dbconfig/20221129-061016-ladsgroup.json
  • 05:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P41579 and previous config saved to /var/cache/conftool/dbconfig/20221129-055536-ladsgroup.json
  • 05:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P41578 and previous config saved to /var/cache/conftool/dbconfig/20221129-055510-ladsgroup.json
  • 05:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Set db1157 with weight 0 T323546', diff saved to https://phabricator.wikimedia.org/P41577 and previous config saved to /var/cache/conftool/dbconfig/20221129-054717-ladsgroup.json
  • 05:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 23 hosts with reason: Primary switchover s3 T323546
  • 05:45 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on 23 hosts with reason: Primary switchover s3 T323546
  • 05:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T323907)', diff saved to https://phabricator.wikimedia.org/P41576 and previous config saved to /var/cache/conftool/dbconfig/20221129-054029-ladsgroup.json
  • 05:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T323907)', diff saved to https://phabricator.wikimedia.org/P41575 and previous config saved to /var/cache/conftool/dbconfig/20221129-054003-ladsgroup.json
  • 05:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2156 (T323907)', diff saved to https://phabricator.wikimedia.org/P41574 and previous config saved to /var/cache/conftool/dbconfig/20221129-052538-ladsgroup.json
  • 05:25 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2094.codfw.wmnet with reason: Maintenance
  • 05:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2094.codfw.wmnet with reason: Maintenance
  • 05:25 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
  • 05:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
  • 05:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T323907)', diff saved to https://phabricator.wikimedia.org/P41573 and previous config saved to /var/cache/conftool/dbconfig/20221129-052512-ladsgroup.json
  • 05:21 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 05:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 05:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T322618)', diff saved to https://phabricator.wikimedia.org/P41572 and previous config saved to /var/cache/conftool/dbconfig/20221129-052004-ladsgroup.json
  • 05:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P41571 and previous config saved to /var/cache/conftool/dbconfig/20221129-051006-ladsgroup.json
  • 05:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P41570 and previous config saved to /var/cache/conftool/dbconfig/20221129-050458-ladsgroup.json
  • 05:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3317 (T323907)', diff saved to https://phabricator.wikimedia.org/P41569 and previous config saved to /var/cache/conftool/dbconfig/20221129-050453-ladsgroup.json
  • 05:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 05:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 05:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T323907)', diff saved to https://phabricator.wikimedia.org/P41568 and previous config saved to /var/cache/conftool/dbconfig/20221129-050431-ladsgroup.json
  • 04:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P41567 and previous config saved to /var/cache/conftool/dbconfig/20221129-045459-ladsgroup.json
  • 04:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P41566 and previous config saved to /var/cache/conftool/dbconfig/20221129-044952-ladsgroup.json
  • 04:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P41565 and previous config saved to /var/cache/conftool/dbconfig/20221129-044924-ladsgroup.json
  • 04:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T323907)', diff saved to https://phabricator.wikimedia.org/P41564 and previous config saved to /var/cache/conftool/dbconfig/20221129-043953-ladsgroup.json
  • 04:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T322618)', diff saved to https://phabricator.wikimedia.org/P41563 and previous config saved to /var/cache/conftool/dbconfig/20221129-043445-ladsgroup.json
  • 04:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P41562 and previous config saved to /var/cache/conftool/dbconfig/20221129-043418-ladsgroup.json
  • 04:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1198 (T322618)', diff saved to https://phabricator.wikimedia.org/P41561 and previous config saved to /var/cache/conftool/dbconfig/20221129-043050-ladsgroup.json
  • 04:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1198.eqiad.wmnet with reason: Maintenance
  • 04:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1198.eqiad.wmnet with reason: Maintenance
  • 04:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T322618)', diff saved to https://phabricator.wikimedia.org/P41560 and previous config saved to /var/cache/conftool/dbconfig/20221129-043040-ladsgroup.json
  • 04:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T323907)', diff saved to https://phabricator.wikimedia.org/P41559 and previous config saved to /var/cache/conftool/dbconfig/20221129-041912-ladsgroup.json
  • 04:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P41558 and previous config saved to /var/cache/conftool/dbconfig/20221129-041534-ladsgroup.json
  • 04:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2149 (T323907)', diff saved to https://phabricator.wikimedia.org/P41557 and previous config saved to /var/cache/conftool/dbconfig/20221129-041332-ladsgroup.json
  • 04:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
  • 04:13 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
  • 04:07 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 04:05 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 04:04 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
  • 04:04 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
  • 04:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P41556 and previous config saved to /var/cache/conftool/dbconfig/20221129-040027-ladsgroup.json
  • 03:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1158 (T323907)', diff saved to https://phabricator.wikimedia.org/P41555 and previous config saved to /var/cache/conftool/dbconfig/20221129-035144-ladsgroup.json
  • 03:51 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 03:51 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 03:51 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 03:51 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 03:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T323907)', diff saved to https://phabricator.wikimedia.org/P41554 and previous config saved to /var/cache/conftool/dbconfig/20221129-035116-ladsgroup.json
  • 03:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T322618)', diff saved to https://phabricator.wikimedia.org/P41553 and previous config saved to /var/cache/conftool/dbconfig/20221129-034521-ladsgroup.json
  • 03:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1189 (T322618)', diff saved to https://phabricator.wikimedia.org/P41552 and previous config saved to /var/cache/conftool/dbconfig/20221129-034126-ladsgroup.json
  • 03:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1189.eqiad.wmnet with reason: Maintenance
  • 03:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1189.eqiad.wmnet with reason: Maintenance
  • 03:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 (T322618)', diff saved to https://phabricator.wikimedia.org/P41551 and previous config saved to /var/cache/conftool/dbconfig/20221129-034116-ladsgroup.json
  • 03:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P41550 and previous config saved to /var/cache/conftool/dbconfig/20221129-033610-ladsgroup.json
  • 03:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P41549 and previous config saved to /var/cache/conftool/dbconfig/20221129-032609-ladsgroup.json
  • 03:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P41548 and previous config saved to /var/cache/conftool/dbconfig/20221129-032103-ladsgroup.json
  • 03:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P41547 and previous config saved to /var/cache/conftool/dbconfig/20221129-031103-ladsgroup.json
  • 03:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 03:07 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 03:07 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
  • 03:06 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
  • 03:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T323907)', diff saved to https://phabricator.wikimedia.org/P41546 and previous config saved to /var/cache/conftool/dbconfig/20221129-030557-ladsgroup.json
  • 02:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 (T322618)', diff saved to https://phabricator.wikimedia.org/P41545 and previous config saved to /var/cache/conftool/dbconfig/20221129-025556-ladsgroup.json
  • 02:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1179 (T322618)', diff saved to https://phabricator.wikimedia.org/P41544 and previous config saved to /var/cache/conftool/dbconfig/20221129-025201-ladsgroup.json
  • 02:52 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance
  • 02:51 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance
  • 02:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T322618)', diff saved to https://phabricator.wikimedia.org/P41543 and previous config saved to /var/cache/conftool/dbconfig/20221129-025151-ladsgroup.json
  • 02:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P41542 and previous config saved to /var/cache/conftool/dbconfig/20221129-023644-ladsgroup.json
  • 02:32 ejegg: civicrm upgraded from efff01e9 to 80edaccc
  • 02:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1127 (T323907)', diff saved to https://phabricator.wikimedia.org/P41541 and previous config saved to /var/cache/conftool/dbconfig/20221129-022657-ladsgroup.json
  • 02:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1127.eqiad.wmnet with reason: Maintenance
  • 02:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1127.eqiad.wmnet with reason: Maintenance
  • 02:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 (T323907)', diff saved to https://phabricator.wikimedia.org/P41540 and previous config saved to /var/cache/conftool/dbconfig/20221129-022636-ladsgroup.json
  • 02:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P41539 and previous config saved to /var/cache/conftool/dbconfig/20221129-022138-ladsgroup.json
  • 02:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P41538 and previous config saved to /var/cache/conftool/dbconfig/20221129-021129-ladsgroup.json
  • 02:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T322618)', diff saved to https://phabricator.wikimedia.org/P41537 and previous config saved to /var/cache/conftool/dbconfig/20221129-020631-ladsgroup.json
  • 02:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1175 (T322618)', diff saved to https://phabricator.wikimedia.org/P41536 and previous config saved to /var/cache/conftool/dbconfig/20221129-020237-ladsgroup.json
  • 02:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
  • 02:02 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
  • 02:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T322618)', diff saved to https://phabricator.wikimedia.org/P41535 and previous config saved to /var/cache/conftool/dbconfig/20221129-020226-ladsgroup.json
  • 01:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P41534 and previous config saved to /var/cache/conftool/dbconfig/20221129-015623-ladsgroup.json
  • 01:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P41533 and previous config saved to /var/cache/conftool/dbconfig/20221129-014720-ladsgroup.json
  • 01:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 (T323907)', diff saved to https://phabricator.wikimedia.org/P41532 and previous config saved to /var/cache/conftool/dbconfig/20221129-014116-ladsgroup.json
  • 01:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P41531 and previous config saved to /var/cache/conftool/dbconfig/20221129-013213-ladsgroup.json
  • 01:27 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cloudvirt1054']
  • 01:26 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudvirt1054']
  • 01:26 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cloudvirt1054']
  • 01:26 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudvirt1054']
  • 01:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T322618)', diff saved to https://phabricator.wikimedia.org/P41530 and previous config saved to /var/cache/conftool/dbconfig/20221129-011707-ladsgroup.json
  • 01:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1166 (T322618)', diff saved to https://phabricator.wikimedia.org/P41529 and previous config saved to /var/cache/conftool/dbconfig/20221129-011312-ladsgroup.json
  • 01:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 01:13 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 01:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T322618)', diff saved to https://phabricator.wikimedia.org/P41528 and previous config saved to /var/cache/conftool/dbconfig/20221129-011302-ladsgroup.json
  • 01:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 01:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 01:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109 (T323907)', diff saved to https://phabricator.wikimedia.org/P41527 and previous config saved to /var/cache/conftool/dbconfig/20221129-011227-ladsgroup.json
  • 01:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T321126)', diff saved to https://phabricator.wikimedia.org/P41526 and previous config saved to /var/cache/conftool/dbconfig/20221129-010332-marostegui.json
  • 00:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P41525 and previous config saved to /var/cache/conftool/dbconfig/20221129-005755-ladsgroup.json
  • 00:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P41524 and previous config saved to /var/cache/conftool/dbconfig/20221129-005720-ladsgroup.json
  • 00:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P41522 and previous config saved to /var/cache/conftool/dbconfig/20221129-004825-marostegui.json
  • 00:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P41521 and previous config saved to /var/cache/conftool/dbconfig/20221129-004249-ladsgroup.json
  • 00:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P41520 and previous config saved to /var/cache/conftool/dbconfig/20221129-004214-ladsgroup.json
  • 00:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3317 (T323907)', diff saved to https://phabricator.wikimedia.org/P41519 and previous config saved to /var/cache/conftool/dbconfig/20221129-003804-ladsgroup.json
  • 00:37 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1101.eqiad.wmnet with reason: Maintenance
  • 00:37 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1101.eqiad.wmnet with reason: Maintenance
  • 00:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 (T323907)', diff saved to https://phabricator.wikimedia.org/P41518 and previous config saved to /var/cache/conftool/dbconfig/20221129-003742-ladsgroup.json
  • 00:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P41517 and previous config saved to /var/cache/conftool/dbconfig/20221129-003319-marostegui.json
  • 00:29 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host arclamp1001.eqiad.wmnet with OS bullseye
  • 00:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T322618)', diff saved to https://phabricator.wikimedia.org/P41516 and previous config saved to /var/cache/conftool/dbconfig/20221129-002742-ladsgroup.json
  • 00:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109 (T323907)', diff saved to https://phabricator.wikimedia.org/P41515 and previous config saved to /var/cache/conftool/dbconfig/20221129-002707-ladsgroup.json
  • 00:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P41514 and previous config saved to /var/cache/conftool/dbconfig/20221129-002236-ladsgroup.json
  • 00:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T321126)', diff saved to https://phabricator.wikimedia.org/P41513 and previous config saved to /var/cache/conftool/dbconfig/20221129-001812-marostegui.json
  • 00:16 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on arclamp1001.eqiad.wmnet with reason: host reimage
  • 00:16 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2179 (T321126)', diff saved to https://phabricator.wikimedia.org/P41512 and previous config saved to /var/cache/conftool/dbconfig/20221129-001559-marostegui.json
  • 00:16 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2179.codfw.wmnet with reason: Maintenance
  • 00:15 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2179.codfw.wmnet with reason: Maintenance
  • 00:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T321126)', diff saved to https://phabricator.wikimedia.org/P41511 and previous config saved to /var/cache/conftool/dbconfig/20221129-001548-marostegui.json
  • 00:12 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on arclamp1001.eqiad.wmnet with reason: host reimage
  • 00:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P41510 and previous config saved to /var/cache/conftool/dbconfig/20221129-000729-ladsgroup.json
  • 00:07 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host arclamp1001.eqiad.wmnet with OS bullseye
  • 00:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1157 (T322618)', diff saved to https://phabricator.wikimedia.org/P41509 and previous config saved to /var/cache/conftool/dbconfig/20221129-000545-ladsgroup.json
  • 00:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1157.eqiad.wmnet with reason: Maintenance
  • 00:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1157.eqiad.wmnet with reason: Maintenance
  • 00:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 00:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 00:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 (T322618)', diff saved to https://phabricator.wikimedia.org/P41508 and previous config saved to /var/cache/conftool/dbconfig/20221129-000341-ladsgroup.json
  • 00:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2109 (T323907)', diff saved to https://phabricator.wikimedia.org/P41507 and previous config saved to /var/cache/conftool/dbconfig/20221129-000153-ladsgroup.json
  • 00:01 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2109.codfw.wmnet with reason: Maintenance
  • 00:01 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2109.codfw.wmnet with reason: Maintenance
  • 00:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105 (T323907)', diff saved to https://phabricator.wikimedia.org/P41506 and previous config saved to /var/cache/conftool/dbconfig/20221129-000143-ladsgroup.json
  • 00:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P41505 and previous config saved to /var/cache/conftool/dbconfig/20221129-000042-marostegui.json

2022-11-28

  • 23:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 23:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 23:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T323827)', diff saved to https://phabricator.wikimedia.org/P41504 and previous config saved to /var/cache/conftool/dbconfig/20221128-235817-ladsgroup.json
  • 23:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 (T323907)', diff saved to https://phabricator.wikimedia.org/P41503 and previous config saved to /var/cache/conftool/dbconfig/20221128-235223-ladsgroup.json
  • 23:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P41502 and previous config saved to /var/cache/conftool/dbconfig/20221128-234834-ladsgroup.json
  • 23:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105', diff saved to https://phabricator.wikimedia.org/P41501 and previous config saved to /var/cache/conftool/dbconfig/20221128-234636-ladsgroup.json
  • 23:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P41500 and previous config saved to /var/cache/conftool/dbconfig/20221128-234535-marostegui.json
  • 23:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P41499 and previous config saved to /var/cache/conftool/dbconfig/20221128-234311-ladsgroup.json
  • 23:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P41498 and previous config saved to /var/cache/conftool/dbconfig/20221128-233328-ladsgroup.json
  • 23:33 ebernhardson@deploy1002: Finished deploy [search/mjolnir/deploy@d361052]: msearch_daemon: Remove cluster selection/load monitor (duration: 00m 51s)
  • 23:32 ebernhardson@deploy1002: Started deploy [search/mjolnir/deploy@d361052]: msearch_daemon: Remove cluster selection/load monitor
  • 23:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105', diff saved to https://phabricator.wikimedia.org/P41497 and previous config saved to /var/cache/conftool/dbconfig/20221128-233130-ladsgroup.json
  • 23:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T321126)', diff saved to https://phabricator.wikimedia.org/P41496 and previous config saved to /var/cache/conftool/dbconfig/20221128-233028-marostegui.json
  • 23:28 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2172 (T321126)', diff saved to https://phabricator.wikimedia.org/P41495 and previous config saved to /var/cache/conftool/dbconfig/20221128-232815-marostegui.json
  • 23:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2172.codfw.wmnet with reason: Maintenance
  • 23:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P41494 and previous config saved to /var/cache/conftool/dbconfig/20221128-232805-ladsgroup.json
  • 23:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2172.codfw.wmnet with reason: Maintenance
  • 23:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T321126)', diff saved to https://phabricator.wikimedia.org/P41493 and previous config saved to /var/cache/conftool/dbconfig/20221128-232754-marostegui.json
  • 23:23 brennen@deploy1002: Finished deploy [phabricator/deployment@f68dc24]: deploy config changes for mysql-port-as-string (T280597) (duration: 00m 55s)
  • 23:22 brennen@deploy1002: Started deploy [phabricator/deployment@f68dc24]: deploy config changes for mysql-port-as-string (T280597)
  • 23:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 (T322618)', diff saved to https://phabricator.wikimedia.org/P41492 and previous config saved to /var/cache/conftool/dbconfig/20221128-231821-ladsgroup.json
  • 23:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105 (T323907)', diff saved to https://phabricator.wikimedia.org/P41491 and previous config saved to /var/cache/conftool/dbconfig/20221128-231623-ladsgroup.json
  • 23:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3317 (T323907)', diff saved to https://phabricator.wikimedia.org/P41490 and previous config saved to /var/cache/conftool/dbconfig/20221128-231548-ladsgroup.json
  • 23:15 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1098.eqiad.wmnet with reason: Maintenance
  • 23:15 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1098.eqiad.wmnet with reason: Maintenance
  • 23:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1112 (T322618)', diff saved to https://phabricator.wikimedia.org/P41489 and previous config saved to /var/cache/conftool/dbconfig/20221128-231426-ladsgroup.json
  • 23:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 23:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 23:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
  • 23:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
  • 23:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T323827)', diff saved to https://phabricator.wikimedia.org/P41488 and previous config saved to /var/cache/conftool/dbconfig/20221128-231258-ladsgroup.json
  • 23:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P41487 and previous config saved to /var/cache/conftool/dbconfig/20221128-231247-marostegui.json
  • 23:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
  • 23:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
  • 22:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20221128-225741-marostegui.json
  • 22:56 sukhe@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99) for hosts cp5006.eqsin.wmnet
  • 22:56 sukhe@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 22:56 sukhe@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cp5006.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - sukhe@cumin2002"
  • 22:54 sukhe@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cp5006.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - sukhe@cumin2002"
  • 22:54 brennen@deploy1002: Finished deploy [phabricator/deployment@f68dc24]: deploy config changes for phab1001 -> phab1004 (T280597) (duration: 00m 52s)
  • 22:53 brennen@deploy1002: Started deploy [phabricator/deployment@f68dc24]: deploy config changes for phab1001 -> phab1004 (T280597)
  • 22:52 sukhe@cumin2002: START - Cookbook sre.dns.netbox
  • 22:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2105 (T323907)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20221128-225101-ladsgroup.json
  • 22:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2105.codfw.wmnet with reason: Maintenance
  • 22:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2105.codfw.wmnet with reason: Maintenance
  • 22:47 sukhe@cumin2002: START - Cookbook sre.hosts.decommission for hosts cp5006.eqsin.wmnet
  • 22:42 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cp5006.eqsin.wmnet with reason: downtimed, to be depooled
  • 22:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T321126)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20221128-224235-marostegui.json
  • 22:42 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on cp5006.eqsin.wmnet with reason: downtimed, to be depooled
  • 22:42 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp5006.eqsin.wmnet,service=varnish-fe
  • 22:42 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp5006.eqsin.wmnet,service=ats-be
  • 22:42 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp5006.eqsin.wmnet,service=ats-tls
  • 22:41 sukhe@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99) for hosts cp[5005,5010].eqsin.wmnet
  • 22:41 sukhe@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 22:41 sukhe@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cp[5005,5010].eqsin.wmnet decommissioned, removing all IPs except the asset tag one - sukhe@cumin2002"
  • 22:40 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2155 (T321126)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20221128-224022-marostegui.json
  • 22:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2095.codfw.wmnet with reason: Maintenance
  • 22:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2095.codfw.wmnet with reason: Maintenance
  • 22:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2155.codfw.wmnet with reason: Maintenance
  • 22:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2155.codfw.wmnet with reason: Maintenance
  • 22:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T321126)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20221128-223956-marostegui.json
  • 22:39 sukhe@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cp[5005,5010].eqsin.wmnet decommissioned, removing all IPs except the asset tag one - sukhe@cumin2002"
  • 22:37 sukhe@cumin2002: START - Cookbook sre.dns.netbox
  • 22:32 sukhe@cumin2002: START - Cookbook sre.hosts.decommission for hosts cp[5005,5010].eqsin.wmnet
  • 22:26 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cp[5005,5010].eqsin.wmnet with reason: downtimed, to be depooled
  • 22:26 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on cp[5005,5010].eqsin.wmnet with reason: downtimed, to be depooled
  • 22:25 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp5010.eqsin.wmnet,service=varnish-fe
  • 22:25 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp5010.eqsin.wmnet,service=ats-be
  • 22:25 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp5010.eqsin.wmnet,service=ats-tls
  • 22:25 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp5005.eqsin.wmnet,service=varnish-fe
  • 22:25 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp5005.eqsin.wmnet,service=ats-be
  • 22:25 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp5005.eqsin.wmnet,service=ats-tls
  • 22:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20221128-222450-marostegui.json
  • 22:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1197 (T323827)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20221128-221242-ladsgroup.json
  • 22:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1197.eqiad.wmnet with reason: Maintenance
  • 22:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1197.eqiad.wmnet with reason: Maintenance
  • 22:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T323827)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20221128-221221-ladsgroup.json
  • 22:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20221128-220944-marostegui.json
  • 22:08 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host arclamp1001.eqiad.wmnet with OS bullseye
  • 22:07 sukhe@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99) for hosts cp[5004,5009].eqsin.wmnet
  • 22:07 sukhe@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 22:07 sukhe@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cp[5004,5009].eqsin.wmnet decommissioned, removing all IPs except the asset tag one - sukhe@cumin2002"
  • 22:06 sukhe@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cp[5004,5009].eqsin.wmnet decommissioned, removing all IPs except the asset tag one - sukhe@cumin2002"
  • 22:03 sukhe@cumin2002: START - Cookbook sre.dns.netbox
  • 22:00 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on phab1001.eqiad.wmnet with reason: T322250
  • 22:00 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 14 days, 0:00:00 on phab1001.eqiad.wmnet with reason: T322250
  • 22:00 brennen: phabricator: phab1001 -> phab1004 migration starting soon; downtime expected (T280597)
  • 21:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P41486 and previous config saved to /var/cache/conftool/dbconfig/20221128-215715-ladsgroup.json
  • 21:55 sukhe@cumin2002: START - Cookbook sre.hosts.decommission for hosts cp[5004,5009].eqsin.wmnet
  • 21:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T321126)', diff saved to https://phabricator.wikimedia.org/P41485 and previous config saved to /var/cache/conftool/dbconfig/20221128-215435-marostegui.json
  • 21:52 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2147 (T321126)', diff saved to https://phabricator.wikimedia.org/P41484 and previous config saved to /var/cache/conftool/dbconfig/20221128-215223-marostegui.json
  • 21:52 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2147.codfw.wmnet with reason: Maintenance
  • 21:52 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2147.codfw.wmnet with reason: Maintenance
  • 21:52 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 21:51 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 21:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314 (T321126)', diff saved to https://phabricator.wikimedia.org/P41483 and previous config saved to /var/cache/conftool/dbconfig/20221128-215151-marostegui.json
  • 21:46 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cp[5004,5009].eqsin.wmnet with reason: downtimed, to be depooled
  • 21:46 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on cp[5004,5009].eqsin.wmnet with reason: downtimed, to be depooled
  • 21:44 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp5009.eqsin.wmnet,service=varnish-fe
  • 21:44 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp5009.eqsin.wmnet,service=ats-be
  • 21:44 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp5009.eqsin.wmnet,service=ats-tls
  • 21:44 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp5004.eqsin.wmnet,service=varnish-fe
  • 21:44 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp5004.eqsin.wmnet,service=ats-be
  • 21:44 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp5004.eqsin.wmnet,service=ats-tls
  • 21:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P41482 and previous config saved to /var/cache/conftool/dbconfig/20221128-214208-ladsgroup.json
  • 21:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314', diff saved to https://phabricator.wikimedia.org/P41481 and previous config saved to /var/cache/conftool/dbconfig/20221128-213645-marostegui.json
  • 21:33 cjming: end of UTC late backport window
  • 21:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T323827)', diff saved to https://phabricator.wikimedia.org/P41480 and previous config saved to /var/cache/conftool/dbconfig/20221128-212702-ladsgroup.json
  • 21:23 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cp[5003,5008].eqsin.wmnet
  • 21:23 sukhe@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 21:23 sukhe@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cp[5003,5008].eqsin.wmnet decommissioned, removing all IPs except the asset tag one - sukhe@cumin2002"
  • 21:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314', diff saved to https://phabricator.wikimedia.org/P41479 and previous config saved to /var/cache/conftool/dbconfig/20221128-212138-marostegui.json
  • 21:20 sukhe@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cp[5003,5008].eqsin.wmnet decommissioned, removing all IPs except the asset tag one - sukhe@cumin2002"
  • 21:18 sukhe@cumin2002: START - Cookbook sre.dns.netbox
  • 21:16 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 21:15 cjming@deploy1002: Finished scap: Backport for Enable shared Reading Lists landing page on all wikis. (T313269) (duration: 06m 22s)
  • 21:13 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 21:13 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
  • 21:12 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
  • 21:12 sukhe@cumin2002: START - Cookbook sre.hosts.decommission for hosts cp[5003,5008].eqsin.wmnet
  • 21:10 cjming@deploy1002: cjming and dbrant: Backport for Enable shared Reading Lists landing page on all wikis. (T313269) synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet
  • 21:09 cjming@deploy1002: Started scap: Backport for Enable shared Reading Lists landing page on all wikis. (T313269)
  • 21:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314 (T321126)', diff saved to https://phabricator.wikimedia.org/P41478 and previous config saved to /var/cache/conftool/dbconfig/20221128-210632-marostegui.json
  • 21:06 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host arclamp1001.eqiad.wmnet with OS bullseye
  • 21:04 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2138:3314 (T321126)', diff saved to https://phabricator.wikimedia.org/P41477 and previous config saved to /var/cache/conftool/dbconfig/20221128-210419-marostegui.json
  • 21:04 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2138.codfw.wmnet with reason: Maintenance
  • 21:04 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2138.codfw.wmnet with reason: Maintenance
  • 21:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314 (T321126)', diff saved to https://phabricator.wikimedia.org/P41476 and previous config saved to /var/cache/conftool/dbconfig/20221128-210408-marostegui.json
  • 21:02 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cp5008.eqsin.wmnet with reason: downtimed, to be depooled
  • 21:02 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on cp5008.eqsin.wmnet with reason: downtimed, to be depooled
  • 21:02 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp5008.eqsin.wmnet,service=varnish-fe
  • 21:02 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp5008.eqsin.wmnet,service=ats-be
  • 21:02 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp5008.eqsin.wmnet,service=ats-tls
  • 21:01 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cp5003.eqsin.wmnet with reason: downtimed, to be depooled
  • 21:01 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on cp5003.eqsin.wmnet with reason: downtimed, to be depooled
  • 20:59 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp5003.eqsin.wmnet,service=varnish-fe
  • 20:59 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp5003.eqsin.wmnet,service=ats-be
  • 20:59 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp5003.eqsin.wmnet,service=ats-tls
  • 20:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 20:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 20:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T323907)', diff saved to https://phabricator.wikimedia.org/P41475 and previous config saved to /var/cache/conftool/dbconfig/20221128-205358-ladsgroup.json
  • 20:52 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 20:51 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 20:51 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
  • 20:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1188 (T323827)', diff saved to https://phabricator.wikimedia.org/P41474 and previous config saved to /var/cache/conftool/dbconfig/20221128-205103-ladsgroup.json
  • 20:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1188.eqiad.wmnet with reason: Maintenance
  • 20:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1188.eqiad.wmnet with reason: Maintenance
  • 20:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T323827)', diff saved to https://phabricator.wikimedia.org/P41473 and previous config saved to /var/cache/conftool/dbconfig/20221128-205041-ladsgroup.json
  • 20:50 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
  • 20:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314', diff saved to https://phabricator.wikimedia.org/P41472 and previous config saved to /var/cache/conftool/dbconfig/20221128-204902-marostegui.json
  • 20:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P41471 and previous config saved to /var/cache/conftool/dbconfig/20221128-203851-ladsgroup.json
  • 20:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P41470 and previous config saved to /var/cache/conftool/dbconfig/20221128-203535-ladsgroup.json
  • 20:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314', diff saved to https://phabricator.wikimedia.org/P41469 and previous config saved to /var/cache/conftool/dbconfig/20221128-203356-marostegui.json
  • 20:32 otto@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: apply
  • 20:31 otto@deploy1002: helmfile [eqiad] START helmfile.d/services/eventgate-main: apply
  • 20:31 otto@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-main: apply
  • 20:30 otto@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-main: apply
  • 20:30 otto@deploy1002: helmfile [staging] DONE helmfile.d/services/eventgate-main: apply
  • 20:29 otto@deploy1002: helmfile [staging] START helmfile.d/services/eventgate-main: apply
  • 20:29 otto@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics-external: apply
  • 20:28 otto@deploy1002: helmfile [eqiad] START helmfile.d/services/eventgate-analytics-external: apply
  • 20:28 otto@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply
  • 20:27 otto@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply
  • 20:27 otto@deploy1002: helmfile [staging] DONE helmfile.d/services/eventgate-analytics-external: apply
  • 20:26 otto@deploy1002: helmfile [staging] START helmfile.d/services/eventgate-analytics-external: apply
  • 20:26 otto@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics: apply
  • 20:25 otto@deploy1002: helmfile [eqiad] START helmfile.d/services/eventgate-analytics: apply
  • 20:25 otto@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics: apply
  • 20:24 otto@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-analytics: apply
  • 20:24 otto@deploy1002: helmfile [staging] DONE helmfile.d/services/eventgate-analytics: apply
  • 20:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P41468 and previous config saved to /var/cache/conftool/dbconfig/20221128-202345-ladsgroup.json
  • 20:23 otto@deploy1002: helmfile [staging] START helmfile.d/services/eventgate-analytics: apply
  • 20:22 otto@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventgate-logging-external: apply
  • 20:21 otto@deploy1002: helmfile [eqiad] START helmfile.d/services/eventgate-logging-external: apply
  • 20:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P41467 and previous config saved to /var/cache/conftool/dbconfig/20221128-202029-ladsgroup.json
  • 20:20 otto@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-logging-external: apply
  • 20:19 otto@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-logging-external: apply
  • 20:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314 (T321126)', diff saved to https://phabricator.wikimedia.org/P41466 and previous config saved to /var/cache/conftool/dbconfig/20221128-201849-marostegui.json
  • 20:18 otto@deploy1002: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply
  • 20:18 otto@deploy1002: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply
  • 20:16 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2137:3314 (T321126)', diff saved to https://phabricator.wikimedia.org/P41465 and previous config saved to /var/cache/conftool/dbconfig/20221128-201636-marostegui.json
  • 20:16 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2137.codfw.wmnet with reason: Maintenance
  • 20:16 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2137.codfw.wmnet with reason: Maintenance
  • 20:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136 (T321126)', diff saved to https://phabricator.wikimedia.org/P41464 and previous config saved to /var/cache/conftool/dbconfig/20221128-201604-marostegui.json
  • 20:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T323907)', diff saved to https://phabricator.wikimedia.org/P41463 and previous config saved to /var/cache/conftool/dbconfig/20221128-200838-ladsgroup.json
  • 20:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T323827)', diff saved to https://phabricator.wikimedia.org/P41462 and previous config saved to /var/cache/conftool/dbconfig/20221128-200522-ladsgroup.json
  • 20:05 sukhe@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp5020.eqsin.wmnet,service=ats-be
  • 20:04 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp5020.eqsin.wmnet,service=ats-be
  • 20:01 bblack@cumin1001: conftool action : set/pooled=yes; selector: name=cp5028.eqsin.wmnet,service=ats-be
  • 20:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136', diff saved to https://phabricator.wikimedia.org/P41461 and previous config saved to /var/cache/conftool/dbconfig/20221128-200058-marostegui.json
  • 20:00 bblack@cumin1001: conftool action : set/pooled=no; selector: name=cp5028.eqsin.wmnet,service=ats-be
  • 19:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1198 (T323907)', diff saved to https://phabricator.wikimedia.org/P41460 and previous config saved to /var/cache/conftool/dbconfig/20221128-195753-ladsgroup.json
  • 19:57 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1198.eqiad.wmnet with reason: Maintenance
  • 19:57 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1198.eqiad.wmnet with reason: Maintenance
  • 19:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T323907)', diff saved to https://phabricator.wikimedia.org/P41459 and previous config saved to /var/cache/conftool/dbconfig/20221128-195731-ladsgroup.json
  • 19:54 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 19:53 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 19:53 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
  • 19:50 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
  • 19:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1182 (T323827)', diff saved to https://phabricator.wikimedia.org/P41458 and previous config saved to /var/cache/conftool/dbconfig/20221128-194703-ladsgroup.json
  • 19:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1182.eqiad.wmnet with reason: Maintenance
  • 19:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1182.eqiad.wmnet with reason: Maintenance
  • 19:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T323827)', diff saved to https://phabricator.wikimedia.org/P41457 and previous config saved to /var/cache/conftool/dbconfig/20221128-194642-ladsgroup.json
  • 19:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136', diff saved to https://phabricator.wikimedia.org/P41456 and previous config saved to /var/cache/conftool/dbconfig/20221128-194551-marostegui.json
  • 19:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P41455 and previous config saved to /var/cache/conftool/dbconfig/20221128-194224-ladsgroup.json
  • 19:41 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cp[5002,5007].eqsin.wmnet
  • 19:41 sukhe@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:41 sukhe@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cp[5002,5007].eqsin.wmnet decommissioned, removing all IPs except the asset tag one - sukhe@cumin2002"
  • 19:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T323827)', diff saved to https://phabricator.wikimedia.org/P41454 and previous config saved to /var/cache/conftool/dbconfig/20221128-193940-ladsgroup.json
  • 19:38 sukhe@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cp[5002,5007].eqsin.wmnet decommissioned, removing all IPs except the asset tag one - sukhe@cumin2002"
  • 19:31 sukhe@cumin2002: START - Cookbook sre.dns.netbox
  • 19:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P41453 and previous config saved to /var/cache/conftool/dbconfig/20221128-193135-ladsgroup.json
  • 19:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136 (T321126)', diff saved to https://phabricator.wikimedia.org/P41452 and previous config saved to /var/cache/conftool/dbconfig/20221128-193043-marostegui.json
  • 19:28 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2136 (T321126)', diff saved to https://phabricator.wikimedia.org/P41451 and previous config saved to /var/cache/conftool/dbconfig/20221128-192830-marostegui.json
  • 19:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2136.codfw.wmnet with reason: Maintenance
  • 19:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2136.codfw.wmnet with reason: Maintenance
  • 19:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119 (T321126)', diff saved to https://phabricator.wikimedia.org/P41450 and previous config saved to /var/cache/conftool/dbconfig/20221128-192758-marostegui.json
  • 19:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P41449 and previous config saved to /var/cache/conftool/dbconfig/20221128-192718-ladsgroup.json
  • 19:25 sukhe@cumin2002: START - Cookbook sre.hosts.decommission for hosts cp[5002,5007].eqsin.wmnet
  • 19:25 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 19:24 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 19:24 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
  • 19:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P41448 and previous config saved to /var/cache/conftool/dbconfig/20221128-192433-ladsgroup.json
  • 19:23 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
  • 19:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P41447 and previous config saved to /var/cache/conftool/dbconfig/20221128-191629-ladsgroup.json
  • 19:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119', diff saved to https://phabricator.wikimedia.org/P41446 and previous config saved to /var/cache/conftool/dbconfig/20221128-191251-marostegui.json
  • 19:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T323907)', diff saved to https://phabricator.wikimedia.org/P41445 and previous config saved to /var/cache/conftool/dbconfig/20221128-191211-ladsgroup.json
  • 19:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P41444 and previous config saved to /var/cache/conftool/dbconfig/20221128-190927-ladsgroup.json
  • 19:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T323827)', diff saved to https://phabricator.wikimedia.org/P41443 and previous config saved to /var/cache/conftool/dbconfig/20221128-190122-ladsgroup.json
  • 19:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1189 (T323907)', diff saved to https://phabricator.wikimedia.org/P41442 and previous config saved to /var/cache/conftool/dbconfig/20221128-190122-ladsgroup.json
  • 19:01 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1189.eqiad.wmnet with reason: Maintenance
  • 19:01 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1189.eqiad.wmnet with reason: Maintenance
  • 19:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 (T323907)', diff saved to https://phabricator.wikimedia.org/P41441 and previous config saved to /var/cache/conftool/dbconfig/20221128-190101-ladsgroup.json
  • 18:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119', diff saved to https://phabricator.wikimedia.org/P41440 and previous config saved to /var/cache/conftool/dbconfig/20221128-185745-marostegui.json
  • 18:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T323827)', diff saved to https://phabricator.wikimedia.org/P41439 and previous config saved to /var/cache/conftool/dbconfig/20221128-185420-ladsgroup.json
  • 18:46 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@276aa70]: relax slas for subgraph and incoming links (duration: 02m 34s)
  • 18:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3312 (T323827)', diff saved to https://phabricator.wikimedia.org/P41438 and previous config saved to /var/cache/conftool/dbconfig/20221128-184603-ladsgroup.json
  • 18:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P41437 and previous config saved to /var/cache/conftool/dbconfig/20221128-184554-ladsgroup.json
  • 18:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 18:45 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 18:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T323827)', diff saved to https://phabricator.wikimedia.org/P41436 and previous config saved to /var/cache/conftool/dbconfig/20221128-184535-ladsgroup.json
  • 18:43 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@276aa70]: relax slas for subgraph and incoming links
  • 18:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119 (T321126)', diff saved to https://phabricator.wikimedia.org/P41435 and previous config saved to /var/cache/conftool/dbconfig/20221128-184238-marostegui.json
  • 18:40 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2119 (T321126)', diff saved to https://phabricator.wikimedia.org/P41434 and previous config saved to /var/cache/conftool/dbconfig/20221128-184025-marostegui.json
  • 18:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2119.codfw.wmnet with reason: Maintenance
  • 18:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T323827)', diff saved to https://phabricator.wikimedia.org/P41433 and previous config saved to /var/cache/conftool/dbconfig/20221128-184017-ladsgroup.json
  • 18:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2119.codfw.wmnet with reason: Maintenance
  • 18:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2110 (T321126)', diff saved to https://phabricator.wikimedia.org/P41432 and previous config saved to /var/cache/conftool/dbconfig/20221128-184004-marostegui.json
  • 18:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2175 (T323827)', diff saved to https://phabricator.wikimedia.org/P41431 and previous config saved to /var/cache/conftool/dbconfig/20221128-183532-ladsgroup.json
  • 18:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2175.codfw.wmnet with reason: Maintenance
  • 18:35 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2175.codfw.wmnet with reason: Maintenance
  • 18:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312 (T323827)', diff saved to https://phabricator.wikimedia.org/P41430 and previous config saved to /var/cache/conftool/dbconfig/20221128-183511-ladsgroup.json
  • 18:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P41429 and previous config saved to /var/cache/conftool/dbconfig/20221128-183048-ladsgroup.json
  • 18:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P41428 and previous config saved to /var/cache/conftool/dbconfig/20221128-183028-ladsgroup.json
  • 18:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P41427 and previous config saved to /var/cache/conftool/dbconfig/20221128-182511-ladsgroup.json
  • 18:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2110', diff saved to https://phabricator.wikimedia.org/P41426 and previous config saved to /var/cache/conftool/dbconfig/20221128-182458-marostegui.json
  • 18:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312', diff saved to https://phabricator.wikimedia.org/P41425 and previous config saved to /var/cache/conftool/dbconfig/20221128-182004-ladsgroup.json
  • 18:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 (T323907)', diff saved to https://phabricator.wikimedia.org/P41424 and previous config saved to /var/cache/conftool/dbconfig/20221128-181541-ladsgroup.json
  • 18:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P41423 and previous config saved to /var/cache/conftool/dbconfig/20221128-181522-ladsgroup.json
  • 18:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P41421 and previous config saved to /var/cache/conftool/dbconfig/20221128-181004-ladsgroup.json
  • 18:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2110', diff saved to https://phabricator.wikimedia.org/P41420 and previous config saved to /var/cache/conftool/dbconfig/20221128-180951-marostegui.json
  • 18:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312', diff saved to https://phabricator.wikimedia.org/P41419 and previous config saved to /var/cache/conftool/dbconfig/20221128-180458-ladsgroup.json
  • 18:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1179 (T323907)', diff saved to https://phabricator.wikimedia.org/P41418 and previous config saved to /var/cache/conftool/dbconfig/20221128-180452-ladsgroup.json
  • 18:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1179.eqiad.wmnet with reason: Maintenance
  • 18:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1179.eqiad.wmnet with reason: Maintenance
  • 18:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T323907)', diff saved to https://phabricator.wikimedia.org/P41417 and previous config saved to /var/cache/conftool/dbconfig/20221128-180431-ladsgroup.json
  • 18:00 jbond@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2050.codfw.wmnet with OS bullseye
  • 18:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T323827)', diff saved to https://phabricator.wikimedia.org/P41415 and previous config saved to /var/cache/conftool/dbconfig/20221128-180015-ladsgroup.json
  • 17:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T323827)', diff saved to https://phabricator.wikimedia.org/P41414 and previous config saved to /var/cache/conftool/dbconfig/20221128-175458-ladsgroup.json
  • 17:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2110 (T321126)', diff saved to https://phabricator.wikimedia.org/P41413 and previous config saved to /var/cache/conftool/dbconfig/20221128-175445-marostegui.json
  • 17:52 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2110 (T321126)', diff saved to https://phabricator.wikimedia.org/P41412 and previous config saved to /var/cache/conftool/dbconfig/20221128-175232-marostegui.json
  • 17:52 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2110.codfw.wmnet with reason: Maintenance
  • 17:52 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2110.codfw.wmnet with reason: Maintenance
  • 17:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106 (T321126)', diff saved to https://phabricator.wikimedia.org/P41411 and previous config saved to /var/cache/conftool/dbconfig/20221128-175210-marostegui.json
  • 17:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312 (T323827)', diff saved to https://phabricator.wikimedia.org/P41410 and previous config saved to /var/cache/conftool/dbconfig/20221128-174951-ladsgroup.json
  • 17:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P41409 and previous config saved to /var/cache/conftool/dbconfig/20221128-174925-ladsgroup.json
  • 17:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1156 (T323827)', diff saved to https://phabricator.wikimedia.org/P41408 and previous config saved to /var/cache/conftool/dbconfig/20221128-174324-ladsgroup.json
  • 17:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 17:43 jbond@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2050.codfw.wmnet with reason: host reimage
  • 17:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 20:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 17:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1156.eqiad.wmnet with reason: Maintenance
  • 17:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1156.eqiad.wmnet with reason: Maintenance
  • 17:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T323827)', diff saved to https://phabricator.wikimedia.org/P41407 and previous config saved to /var/cache/conftool/dbconfig/20221128-174213-ladsgroup.json
  • 17:39 jbond@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2050.codfw.wmnet with reason: host reimage
  • 17:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106', diff saved to https://phabricator.wikimedia.org/P41406 and previous config saved to /var/cache/conftool/dbconfig/20221128-173704-marostegui.json
  • 17:35 jnuche@deploy1002: Installation of scap version "4.29.2" completed for 558 hosts
  • 17:35 jnuche@deploy1002: Installing scap version "4.29.2" for 558 hosts
  • 17:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P41405 and previous config saved to /var/cache/conftool/dbconfig/20221128-173418-ladsgroup.json
  • 17:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2170:3312 (T323827)', diff saved to https://phabricator.wikimedia.org/P41404 and previous config saved to /var/cache/conftool/dbconfig/20221128-173227-ladsgroup.json
  • 17:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2170.codfw.wmnet with reason: Maintenance
  • 17:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2170.codfw.wmnet with reason: Maintenance
  • 17:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T323827)', diff saved to https://phabricator.wikimedia.org/P41403 and previous config saved to /var/cache/conftool/dbconfig/20221128-173206-ladsgroup.json
  • 17:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P41402 and previous config saved to /var/cache/conftool/dbconfig/20221128-172707-ladsgroup.json
  • 17:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2177 (T323827)', diff saved to https://phabricator.wikimedia.org/P41401 and previous config saved to /var/cache/conftool/dbconfig/20221128-172442-ladsgroup.json
  • 17:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2177.codfw.wmnet with reason: Maintenance
  • 17:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2177.codfw.wmnet with reason: Maintenance
  • 17:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T323827)', diff saved to https://phabricator.wikimedia.org/P41400 and previous config saved to /var/cache/conftool/dbconfig/20221128-172419-ladsgroup.json
  • 17:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106', diff saved to https://phabricator.wikimedia.org/P41399 and previous config saved to /var/cache/conftool/dbconfig/20221128-172157-marostegui.json
  • 17:21 jbond@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2050.codfw.wmnet with OS bullseye
  • 17:20 jbond@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2050.codfw.wmnet with OS bullseye
  • 17:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T323907)', diff saved to https://phabricator.wikimedia.org/P41398 and previous config saved to /var/cache/conftool/dbconfig/20221128-171911-ladsgroup.json
  • 17:17 jbond@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2050.codfw.wmnet with reason: host reimage
  • 17:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P41397 and previous config saved to /var/cache/conftool/dbconfig/20221128-171659-ladsgroup.json
  • 17:14 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on mc-wf2002.codfw.wmnet with reason: Kernel upgrade
  • 17:14 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime for 0:15:00 on mc-wf2002.codfw.wmnet with reason: Kernel upgrade
  • 17:14 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on mc-wf2001.codfw.wmnet with reason: Kernel upgrade
  • 17:13 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime for 0:15:00 on mc-wf2001.codfw.wmnet with reason: Kernel upgrade
  • 17:13 jbond@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2050.codfw.wmnet with reason: host reimage
  • 17:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P41396 and previous config saved to /var/cache/conftool/dbconfig/20221128-171200-ladsgroup.json
  • 17:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P41395 and previous config saved to /var/cache/conftool/dbconfig/20221128-170912-ladsgroup.json
  • 17:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106 (T321126)', diff saved to https://phabricator.wikimedia.org/P41394 and previous config saved to /var/cache/conftool/dbconfig/20221128-170651-marostegui.json
  • 17:04 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2106 (T321126)', diff saved to https://phabricator.wikimedia.org/P41393 and previous config saved to /var/cache/conftool/dbconfig/20221128-170438-marostegui.json
  • 17:04 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2106.codfw.wmnet with reason: Maintenance
  • 17:04 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2106.codfw.wmnet with reason: Maintenance
  • 17:04 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2099.codfw.wmnet with reason: Maintenance
  • 17:04 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2099.codfw.wmnet with reason: Maintenance
  • 17:03 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 17:03 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 17:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T321126)', diff saved to https://phabricator.wikimedia.org/P41392 and previous config saved to /var/cache/conftool/dbconfig/20221128-170340-marostegui.json
  • 17:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P41391 and previous config saved to /var/cache/conftool/dbconfig/20221128-170153-ladsgroup.json
  • 16:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T323827)', diff saved to https://phabricator.wikimedia.org/P41390 and previous config saved to /var/cache/conftool/dbconfig/20221128-165654-ladsgroup.json
  • 16:56 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 16:55 jbond@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2050.codfw.wmnet with OS bullseye
  • 16:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P41389 and previous config saved to /var/cache/conftool/dbconfig/20221128-165406-ladsgroup.json
  • 16:53 jbond@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be2050.codfw.wmnet with OS bullseye
  • 16:52 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 16:52 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
  • 16:48 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
  • 16:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P41388 and previous config saved to /var/cache/conftool/dbconfig/20221128-164834-marostegui.json
  • 16:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T323827)', diff saved to https://phabricator.wikimedia.org/P41387 and previous config saved to /var/cache/conftool/dbconfig/20221128-164646-ladsgroup.json
  • 16:44 jdrewniak@deploy1002: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 04m 28s)
  • 16:43 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 16:40 jdrewniak@deploy1002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 04m 33s)
  • 16:40 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 16:39 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
  • 16:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T323827)', diff saved to https://phabricator.wikimedia.org/P41386 and previous config saved to /var/cache/conftool/dbconfig/20221128-163859-ladsgroup.json
  • 16:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3312 (T323827)', diff saved to https://phabricator.wikimedia.org/P41385 and previous config saved to /var/cache/conftool/dbconfig/20221128-163850-ladsgroup.json
  • 16:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1146.eqiad.wmnet with reason: Maintenance
  • 16:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1146.eqiad.wmnet with reason: Maintenance
  • 16:37 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
  • 16:34 jbond@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2050.codfw.wmnet with OS bullseye
  • 16:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P41384 and previous config saved to /var/cache/conftool/dbconfig/20221128-163328-marostegui.json
  • 16:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2148 (T323827)', diff saved to https://phabricator.wikimedia.org/P41383 and previous config saved to /var/cache/conftool/dbconfig/20221128-162945-ladsgroup.json
  • 16:29 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2148.codfw.wmnet with reason: Maintenance
  • 16:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2148.codfw.wmnet with reason: Maintenance
  • 16:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312 (T323827)', diff saved to https://phabricator.wikimedia.org/P41382 and previous config saved to /var/cache/conftool/dbconfig/20221128-162923-ladsgroup.json
  • 16:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1175 (T323907)', diff saved to https://phabricator.wikimedia.org/P41381 and previous config saved to /var/cache/conftool/dbconfig/20221128-162815-ladsgroup.json
  • 16:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1175.eqiad.wmnet with reason: Maintenance
  • 16:28 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1175.eqiad.wmnet with reason: Maintenance
  • 16:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T323907)', diff saved to https://phabricator.wikimedia.org/P41380 and previous config saved to /var/cache/conftool/dbconfig/20221128-162753-ladsgroup.json
  • 16:25 jbond@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2050.codfw.wmnet with OS bullseye
  • 16:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1139.eqiad.wmnet with reason: Maintenance
  • 16:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1139.eqiad.wmnet with reason: Maintenance
  • 16:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 (T323827)', diff saved to https://phabricator.wikimedia.org/P41379 and previous config saved to /var/cache/conftool/dbconfig/20221128-162436-ladsgroup.json
  • 16:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2156 (T323827)', diff saved to https://phabricator.wikimedia.org/P41378 and previous config saved to /var/cache/conftool/dbconfig/20221128-162246-ladsgroup.json
  • 16:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on db2094.codfw.wmnet with reason: Maintenance
  • 16:22 jbond@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2050.codfw.wmnet with reason: host reimage
  • 16:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 20:00:00 on db2094.codfw.wmnet with reason: Maintenance
  • 16:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2156.codfw.wmnet with reason: Maintenance
  • 16:21 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2156.codfw.wmnet with reason: Maintenance
  • 16:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T323827)', diff saved to https://phabricator.wikimedia.org/P41377 and previous config saved to /var/cache/conftool/dbconfig/20221128-162148-ladsgroup.json
  • 16:19 jbond@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2050.codfw.wmnet with reason: host reimage
  • 16:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T321126)', diff saved to https://phabricator.wikimedia.org/P41376 and previous config saved to /var/cache/conftool/dbconfig/20221128-161820-marostegui.json
  • 16:17 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1199 (T321126)', diff saved to https://phabricator.wikimedia.org/P41375 and previous config saved to /var/cache/conftool/dbconfig/20221128-161610-marostegui.json
  • 16:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1199.eqiad.wmnet with reason: Maintenance
  • 16:16 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1199.eqiad.wmnet with reason: Maintenance
  • 16:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T321126)', diff saved to https://phabricator.wikimedia.org/P41374 and previous config saved to /var/cache/conftool/dbconfig/20221128-161549-marostegui.json
  • 16:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312', diff saved to https://phabricator.wikimedia.org/P41373 and previous config saved to /var/cache/conftool/dbconfig/20221128-161417-ladsgroup.json
  • 16:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P41372 and previous config saved to /var/cache/conftool/dbconfig/20221128-161247-ladsgroup.json
  • 16:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P41371 and previous config saved to /var/cache/conftool/dbconfig/20221128-160929-ladsgroup.json
  • 16:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P41370 and previous config saved to /var/cache/conftool/dbconfig/20221128-160641-ladsgroup.json
  • 16:06 isaranto@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
  • 16:01 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
  • 16:01 jbond@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2050.codfw.wmnet with OS bullseye
  • 16:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P41369 and previous config saved to /var/cache/conftool/dbconfig/20221128-160042-marostegui.json
  • 16:00 jbond@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be2050.codfw.wmnet with OS bullseye
  • 15:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312', diff saved to https://phabricator.wikimedia.org/P41368 and previous config saved to /var/cache/conftool/dbconfig/20221128-155910-ladsgroup.json
  • 15:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P41367 and previous config saved to /var/cache/conftool/dbconfig/20221128-155740-ladsgroup.json
  • 15:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P41366 and previous config saved to /var/cache/conftool/dbconfig/20221128-155423-ladsgroup.json
  • 15:53 jbond@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2050.codfw.wmnet with OS bullseye
  • 15:52 isaranto@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
  • 15:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P41365 and previous config saved to /var/cache/conftool/dbconfig/20221128-155135-ladsgroup.json
  • 15:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P41364 and previous config saved to /var/cache/conftool/dbconfig/20221128-154536-marostegui.json
  • 15:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312 (T323827)', diff saved to https://phabricator.wikimedia.org/P41363 and previous config saved to /var/cache/conftool/dbconfig/20221128-154404-ladsgroup.json
  • 15:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T323907)', diff saved to https://phabricator.wikimedia.org/P41362 and previous config saved to /var/cache/conftool/dbconfig/20221128-154234-ladsgroup.json
  • 15:41 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/thumbor: apply
  • 15:41 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
  • 15:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 (T323827)', diff saved to https://phabricator.wikimedia.org/P41361 and previous config saved to /var/cache/conftool/dbconfig/20221128-153916-ladsgroup.json
  • 15:39 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/thumbor: apply
  • 15:38 oblivian@deploy1002: helmfile [staging] DONE helmfile.d/services/thumbor: apply
  • 15:37 oblivian@deploy1002: helmfile [staging] START helmfile.d/services/thumbor: apply
  • 15:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T323827)', diff saved to https://phabricator.wikimedia.org/P41360 and previous config saved to /var/cache/conftool/dbconfig/20221128-153628-ladsgroup.json
  • 15:34 filippo@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=thanos-query,name=eqiad
  • 15:33 godog: revert back to thanos 0.21 - T303154
  • 15:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T321126)', diff saved to https://phabricator.wikimedia.org/P41359 and previous config saved to /var/cache/conftool/dbconfig/20221128-153029-marostegui.json
  • 15:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1129 (T323827)', diff saved to https://phabricator.wikimedia.org/P41358 and previous config saved to /var/cache/conftool/dbconfig/20221128-153016-ladsgroup.json
  • 15:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1129.eqiad.wmnet with reason: Maintenance
  • 15:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1129.eqiad.wmnet with reason: Maintenance
  • 15:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122 (T323827)', diff saved to https://phabricator.wikimedia.org/P41357 and previous config saved to /var/cache/conftool/dbconfig/20221128-152955-ladsgroup.json
  • 15:28 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1190 (T321126)', diff saved to https://phabricator.wikimedia.org/P41356 and previous config saved to /var/cache/conftool/dbconfig/20221128-152820-marostegui.json
  • 15:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1190.eqiad.wmnet with reason: Maintenance
  • 15:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1190.eqiad.wmnet with reason: Maintenance
  • 15:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160 (T321126)', diff saved to https://phabricator.wikimedia.org/P41355 and previous config saved to /var/cache/conftool/dbconfig/20221128-152758-marostegui.json
  • 15:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2138:3312 (T323827)', diff saved to https://phabricator.wikimedia.org/P41354 and previous config saved to /var/cache/conftool/dbconfig/20221128-152631-ladsgroup.json
  • 15:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2138.codfw.wmnet with reason: Maintenance
  • 15:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2138.codfw.wmnet with reason: Maintenance
  • 15:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126 (T323827)', diff saved to https://phabricator.wikimedia.org/P41353 and previous config saved to /var/cache/conftool/dbconfig/20221128-152609-ladsgroup.json
  • 15:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122', diff saved to https://phabricator.wikimedia.org/P41352 and previous config saved to /var/cache/conftool/dbconfig/20221128-151448-ladsgroup.json
  • 15:13 jbond@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2050.codfw.wmnet with OS bullseye
  • 15:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P41351 and previous config saved to /var/cache/conftool/dbconfig/20221128-151252-marostegui.json
  • 15:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P41350 and previous config saved to /var/cache/conftool/dbconfig/20221128-151103-ladsgroup.json
  • 15:07 btullis@cumin1001: END (PASS) - Cookbook sre.presto.roll-restart-workers (exit_code=0) for Presto analytics cluster: Roll restart of all Presto's jvm daemons.
  • 15:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1166 (T323907)', diff saved to https://phabricator.wikimedia.org/P41349 and previous config saved to /var/cache/conftool/dbconfig/20221128-150654-ladsgroup.json
  • 15:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2149 (T323827)', diff saved to https://phabricator.wikimedia.org/P41348 and previous config saved to /var/cache/conftool/dbconfig/20221128-150643-ladsgroup.json
  • 15:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 15:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2149.codfw.wmnet with reason: Maintenance
  • 15:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 15:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T323907)', diff saved to https://phabricator.wikimedia.org/P41347 and previous config saved to /var/cache/conftool/dbconfig/20221128-150626-ladsgroup.json
  • 15:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2149.codfw.wmnet with reason: Maintenance
  • 14:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122', diff saved to https://phabricator.wikimedia.org/P41346 and previous config saved to /var/cache/conftool/dbconfig/20221128-145942-ladsgroup.json
  • 14:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P41345 and previous config saved to /var/cache/conftool/dbconfig/20221128-145745-marostegui.json
  • 14:57 btullis@cumin1001: START - Cookbook sre.presto.roll-restart-workers for Presto analytics cluster: Roll restart of all Presto's jvm daemons.
  • 14:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P41344 and previous config saved to /var/cache/conftool/dbconfig/20221128-145556-ladsgroup.json
  • 14:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P41343 and previous config saved to /var/cache/conftool/dbconfig/20221128-145120-ladsgroup.json
  • 14:45 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 14:44 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 14:44 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
  • 14:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122 (T323827)', diff saved to https://phabricator.wikimedia.org/P41342 and previous config saved to /var/cache/conftool/dbconfig/20221128-144435-ladsgroup.json
  • 14:42 Lucas_WMDE: UTC afternoon backport+config window done
  • 14:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160 (T321126)', diff saved to https://phabricator.wikimedia.org/P41341 and previous config saved to /var/cache/conftool/dbconfig/20221128-144239-marostegui.json
  • 14:41 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
  • 14:41 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ printf 'https://en.wikipedia.org/static/images/project-logos/trwikimedia%s.png\n' '-1.5x' '-2x' | mwscript purgeList.php # T323850
  • 14:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126 (T323827)', diff saved to https://phabricator.wikimedia.org/P41340 and previous config saved to /var/cache/conftool/dbconfig/20221128-144050-ladsgroup.json
  • 14:40 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1160 (T321126)', diff saved to https://phabricator.wikimedia.org/P41339 and previous config saved to /var/cache/conftool/dbconfig/20221128-144029-marostegui.json
  • 14:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1160.eqiad.wmnet with reason: Maintenance
  • 14:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1160.eqiad.wmnet with reason: Maintenance
  • 14:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 14:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 14:39 lucaswerkmeister-wmde@deploy1002: Finished scap: Backport for trwikimedia: Update logo (T323850) (duration: 05m 24s)
  • 14:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 (T321126)', diff saved to https://phabricator.wikimedia.org/P41338 and previous config saved to /var/cache/conftool/dbconfig/20221128-143952-marostegui.json
  • 14:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 14:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 14:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109 (T323827)', diff saved to https://phabricator.wikimedia.org/P41337 and previous config saved to /var/cache/conftool/dbconfig/20221128-143908-ladsgroup.json
  • 14:37 btullis@cumin1001: END (PASS) - Cookbook sre.presto.roll-restart-workers (exit_code=0) for Presto analytics cluster: Roll restart of all Presto's jvm daemons.
  • 14:36 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 14:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P41336 and previous config saved to /var/cache/conftool/dbconfig/20221128-143613-ladsgroup.json
  • 14:35 lucaswerkmeister-wmde@deploy1002: lucaswerkmeister-wmde and stang: Backport for trwikimedia: Update logo (T323850) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet
  • 14:35 moritzm: rebalance Ganeti group D/eqiad T311687
  • 14:34 lucaswerkmeister-wmde@deploy1002: Started scap: Backport for trwikimedia: Update logo (T323850)
  • 14:33 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 14:33 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
  • 14:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2126 (T323827)', diff saved to https://phabricator.wikimedia.org/P41335 and previous config saved to /var/cache/conftool/dbconfig/20221128-143231-ladsgroup.json
  • 14:32 lucaswerkmeister-wmde@deploy1002: Finished scap: Backport for wikidatawiki: Add ne language logo variant (T323734) (duration: 05m 52s)
  • 14:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on db2095.codfw.wmnet with reason: Maintenance
  • 14:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 20:00:00 on db2095.codfw.wmnet with reason: Maintenance
  • 14:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2126.codfw.wmnet with reason: Maintenance
  • 14:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2126.codfw.wmnet with reason: Maintenance
  • 14:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2125 (T323827)', diff saved to https://phabricator.wikimedia.org/P41334 and previous config saved to /var/cache/conftool/dbconfig/20221128-143154-ladsgroup.json
  • 14:29 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
  • 14:27 lucaswerkmeister-wmde@deploy1002: lucaswerkmeister-wmde and stang: Backport for wikidatawiki: Add ne language logo variant (T323734) synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet
  • 14:26 lucaswerkmeister-wmde@deploy1002: Started scap: Backport for wikidatawiki: Add ne language logo variant (T323734)
  • 14:26 btullis@cumin1001: START - Cookbook sre.presto.roll-restart-workers for Presto analytics cluster: Roll restart of all Presto's jvm daemons.
  • 14:25 jbond@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2050.codfw.wmnet with OS bullseye
  • 14:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P41333 and previous config saved to /var/cache/conftool/dbconfig/20221128-142446-marostegui.json
  • 14:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P41332 and previous config saved to /var/cache/conftool/dbconfig/20221128-142402-ladsgroup.json
  • 14:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T323907)', diff saved to https://phabricator.wikimedia.org/P41331 and previous config saved to /var/cache/conftool/dbconfig/20221128-142107-ladsgroup.json
  • 14:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2125', diff saved to https://phabricator.wikimedia.org/P41330 and previous config saved to /var/cache/conftool/dbconfig/20221128-141648-ladsgroup.json
  • 14:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1157 (T323907)', diff saved to https://phabricator.wikimedia.org/P41329 and previous config saved to /var/cache/conftool/dbconfig/20221128-141016-ladsgroup.json
  • 14:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1157.eqiad.wmnet with reason: Maintenance
  • 14:09 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1157.eqiad.wmnet with reason: Maintenance
  • 14:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P41328 and previous config saved to /var/cache/conftool/dbconfig/20221128-140939-marostegui.json
  • 14:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P41327 and previous config saved to /var/cache/conftool/dbconfig/20221128-140855-ladsgroup.json
  • 14:06 jbond@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2050.codfw.wmnet with OS bullseye
  • 14:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2125', diff saved to https://phabricator.wikimedia.org/P41326 and previous config saved to /var/cache/conftool/dbconfig/20221128-140141-ladsgroup.json
  • 13:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 (T321126)', diff saved to https://phabricator.wikimedia.org/P41325 and previous config saved to /var/cache/conftool/dbconfig/20221128-135433-marostegui.json
  • 13:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109 (T323827)', diff saved to https://phabricator.wikimedia.org/P41324 and previous config saved to /var/cache/conftool/dbconfig/20221128-135349-ladsgroup.json
  • 13:52 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1149 (T321126)', diff saved to https://phabricator.wikimedia.org/P41323 and previous config saved to /var/cache/conftool/dbconfig/20221128-135223-marostegui.json
  • 13:52 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1149.eqiad.wmnet with reason: Maintenance
  • 13:52 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1149.eqiad.wmnet with reason: Maintenance
  • 13:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 (T321126)', diff saved to https://phabricator.wikimedia.org/P41322 and previous config saved to /var/cache/conftool/dbconfig/20221128-135202-marostegui.json
  • 13:51 moritzm: rebalance Ganeti group C/eqiad T311687
  • 13:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 13:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 13:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 (T323907)', diff saved to https://phabricator.wikimedia.org/P41321 and previous config saved to /var/cache/conftool/dbconfig/20221128-135002-ladsgroup.json
  • 13:49 jbond@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2050.codfw.wmnet with reason: host reimage
  • 13:47 godog: restart grafana-server on grafana1002
  • 13:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2125 (T323827)', diff saved to https://phabricator.wikimedia.org/P41320 and previous config saved to /var/cache/conftool/dbconfig/20221128-134635-ladsgroup.json
  • 13:45 jbond@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2050.codfw.wmnet with reason: host reimage
  • 13:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P41319 and previous config saved to /var/cache/conftool/dbconfig/20221128-133655-marostegui.json
  • 13:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1122 (T323827)', diff saved to https://phabricator.wikimedia.org/P41318 and previous config saved to /var/cache/conftool/dbconfig/20221128-133648-ladsgroup.json
  • 13:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1122.eqiad.wmnet with reason: Maintenance
  • 13:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1122.eqiad.wmnet with reason: Maintenance
  • 13:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 (T323827)', diff saved to https://phabricator.wikimedia.org/P41317 and previous config saved to /var/cache/conftool/dbconfig/20221128-133615-ladsgroup.json
  • 13:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P41316 and previous config saved to /var/cache/conftool/dbconfig/20221128-133456-ladsgroup.json
  • 13:32 filippo@cumin1001: conftool action : set/pooled=false; selector: dnsdisc=thanos-query,name=eqiad
  • 13:27 jbond@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2050.codfw.wmnet with OS bullseye
  • 13:27 jbond@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be2050.codfw.wmnet with OS bullseye
  • 13:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2125 (T323827)', diff saved to https://phabricator.wikimedia.org/P41315 and previous config saved to /var/cache/conftool/dbconfig/20221128-132706-ladsgroup.json
  • 13:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2125.codfw.wmnet with reason: Maintenance
  • 13:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2125.codfw.wmnet with reason: Maintenance
  • 13:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2104 (T323827)', diff saved to https://phabricator.wikimedia.org/P41314 and previous config saved to /var/cache/conftool/dbconfig/20221128-132645-ladsgroup.json
  • 13:24 godog: upgrade thanos on prometheus2* - T303154
  • 13:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2109 (T323827)', diff saved to https://phabricator.wikimedia.org/P41313 and previous config saved to /var/cache/conftool/dbconfig/20221128-132415-ladsgroup.json
  • 13:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2109.codfw.wmnet with reason: Maintenance
  • 13:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2109.codfw.wmnet with reason: Maintenance
  • 13:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105 (T323827)', diff saved to https://phabricator.wikimedia.org/P41312 and previous config saved to /var/cache/conftool/dbconfig/20221128-132404-ladsgroup.json
  • 13:21 godog: upgrade thanos on thanos-fe2* - T303154
  • 13:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P41311 and previous config saved to /var/cache/conftool/dbconfig/20221128-132149-marostegui.json
  • 13:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P41310 and previous config saved to /var/cache/conftool/dbconfig/20221128-132109-ladsgroup.json
  • 13:20 moritzm: rebalance Ganeti group B/codfw following reboots
  • 13:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P41309 and previous config saved to /var/cache/conftool/dbconfig/20221128-131949-ladsgroup.json
  • 13:18 godog: upgrade thanos on thanos-fe2001 - T303154
  • 13:16 jbond@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2050.codfw.wmnet with OS bullseye
  • 13:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2104', diff saved to https://phabricator.wikimedia.org/P41308 and previous config saved to /var/cache/conftool/dbconfig/20221128-131138-ladsgroup.json
  • 13:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105', diff saved to https://phabricator.wikimedia.org/P41307 and previous config saved to /var/cache/conftool/dbconfig/20221128-130858-ladsgroup.json
  • 13:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 (T321126)', diff saved to https://phabricator.wikimedia.org/P41306 and previous config saved to /var/cache/conftool/dbconfig/20221128-130642-marostegui.json
  • 13:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P41305 and previous config saved to /var/cache/conftool/dbconfig/20221128-130603-ladsgroup.json
  • 13:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 (T323907)', diff saved to https://phabricator.wikimedia.org/P41304 and previous config saved to /var/cache/conftool/dbconfig/20221128-130443-ladsgroup.json
  • 12:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2104', diff saved to https://phabricator.wikimedia.org/P41303 and previous config saved to /var/cache/conftool/dbconfig/20221128-125632-ladsgroup.json
  • 12:56 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1148.eqiad.wmnet with reason: Maintenance
  • 12:56 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1148.eqiad.wmnet with reason: Maintenance
  • 12:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 (T321126)', diff saved to https://phabricator.wikimedia.org/P41302 and previous config saved to /var/cache/conftool/dbconfig/20221128-125612-marostegui.json
  • 12:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105', diff saved to https://phabricator.wikimedia.org/P41301 and previous config saved to /var/cache/conftool/dbconfig/20221128-125351-ladsgroup.json
  • 12:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1112 (T323907)', diff saved to https://phabricator.wikimedia.org/P41300 and previous config saved to /var/cache/conftool/dbconfig/20221128-125200-ladsgroup.json
  • 12:51 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 12:51 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 12:51 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1112.eqiad.wmnet with reason: Maintenance
  • 12:51 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1112.eqiad.wmnet with reason: Maintenance
  • 12:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 (T323827)', diff saved to https://phabricator.wikimedia.org/P41299 and previous config saved to /var/cache/conftool/dbconfig/20221128-125056-ladsgroup.json
  • 12:47 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/termbox: apply
  • 12:46 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/termbox: apply
  • 12:45 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/termbox: apply
  • 12:44 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/termbox: apply
  • 12:44 oblivian@deploy1002: helmfile [staging] DONE helmfile.d/services/termbox: apply
  • 12:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2104 (T323827)', diff saved to https://phabricator.wikimedia.org/P41298 and previous config saved to /var/cache/conftool/dbconfig/20221128-124125-ladsgroup.json
  • 12:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P41297 and previous config saved to /var/cache/conftool/dbconfig/20221128-124105-marostegui.json
  • 12:40 oblivian@deploy1002: helmfile [staging] START helmfile.d/services/termbox: apply
  • 12:38 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/similar-users: apply
  • 12:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105 (T323827)', diff saved to https://phabricator.wikimedia.org/P41296 and previous config saved to /var/cache/conftool/dbconfig/20221128-123845-ladsgroup.json
  • 12:37 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/similar-users: apply
  • 12:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2104 (T323827)', diff saved to https://phabricator.wikimedia.org/P41295 and previous config saved to /var/cache/conftool/dbconfig/20221128-123317-ladsgroup.json
  • 12:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repool db2109', diff saved to https://phabricator.wikimedia.org/P41294 and previous config saved to /var/cache/conftool/dbconfig/20221128-123312-ladsgroup.json
  • 12:33 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2104.codfw.wmnet with reason: Maintenance
  • 12:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3312 (T323827)', diff saved to https://phabricator.wikimedia.org/P41293 and previous config saved to /var/cache/conftool/dbconfig/20221128-123251-ladsgroup.json
  • 12:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2104.codfw.wmnet with reason: Maintenance
  • 12:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1105.eqiad.wmnet with reason: Maintenance
  • 12:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1105.eqiad.wmnet with reason: Maintenance
  • 12:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2109 (T323907)', diff saved to https://phabricator.wikimedia.org/P41292 and previous config saved to /var/cache/conftool/dbconfig/20221128-123206-ladsgroup.json
  • 12:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2109.codfw.wmnet with reason: Maintenance
  • 12:31 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2109.codfw.wmnet with reason: Maintenance
  • 12:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1102.eqiad.wmnet with reason: Maintenance
  • 12:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1102.eqiad.wmnet with reason: Maintenance
  • 12:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P41291 and previous config saved to /var/cache/conftool/dbconfig/20221128-122559-marostegui.json
  • 12:22 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/similar-users: apply
  • 12:22 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/thumbor: sync
  • 12:21 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/thumbor: sync
  • 12:20 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/similar-users: apply
  • 12:18 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2097.codfw.wmnet with reason: Maintenance
  • 12:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2097.codfw.wmnet with reason: Maintenance
  • 12:18 oblivian@deploy1002: helmfile [staging] DONE helmfile.d/services/similar-users: apply
  • 12:18 oblivian@deploy1002: helmfile [staging] START helmfile.d/services/similar-users: apply
  • 12:18 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1102.eqiad.wmnet with reason: Maintenance
  • 12:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1102.eqiad.wmnet with reason: Maintenance
  • 12:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 (T321126)', diff saved to https://phabricator.wikimedia.org/P41290 and previous config saved to /var/cache/conftool/dbconfig/20221128-121052-marostegui.json
  • 12:08 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1147 (T321126)', diff saved to https://phabricator.wikimedia.org/P41289 and previous config saved to /var/cache/conftool/dbconfig/20221128-120843-marostegui.json
  • 12:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1147.eqiad.wmnet with reason: Maintenance
  • 12:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1147.eqiad.wmnet with reason: Maintenance
  • 12:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T321126)', diff saved to https://phabricator.wikimedia.org/P41288 and previous config saved to /var/cache/conftool/dbconfig/20221128-120822-marostegui.json
  • 12:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2105 (T323827)', diff saved to https://phabricator.wikimedia.org/P41287 and previous config saved to /var/cache/conftool/dbconfig/20221128-120727-ladsgroup.json
  • 12:07 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2105.codfw.wmnet with reason: Maintenance
  • 12:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2105.codfw.wmnet with reason: Maintenance
  • 11:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P41286 and previous config saved to /var/cache/conftool/dbconfig/20221128-115316-marostegui.json
  • 11:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P41285 and previous config saved to /var/cache/conftool/dbconfig/20221128-113809-marostegui.json
  • 11:30 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1043.eqiad.wmnet with OS bullseye
  • 11:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T321126)', diff saved to https://phabricator.wikimedia.org/P41284 and previous config saved to /var/cache/conftool/dbconfig/20221128-112302-marostegui.json
  • 11:20 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3314 (T321126)', diff saved to https://phabricator.wikimedia.org/P41283 and previous config saved to /var/cache/conftool/dbconfig/20221128-112053-marostegui.json
  • 11:20 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1146.eqiad.wmnet with reason: Maintenance
  • 11:20 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1146.eqiad.wmnet with reason: Maintenance
  • 11:20 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 11:20 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 11:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T321126)', diff saved to https://phabricator.wikimedia.org/P41282 and previous config saved to /var/cache/conftool/dbconfig/20221128-112003-marostegui.json
  • 11:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2032.codfw.wmnet to cluster codfw and group B
  • 11:05 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1043.eqiad.wmnet with reason: host reimage
  • 11:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P41281 and previous config saved to /var/cache/conftool/dbconfig/20221128-110456-marostegui.json
  • 11:02 aborrero@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1043.eqiad.wmnet with reason: host reimage
  • 10:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P41280 and previous config saved to /var/cache/conftool/dbconfig/20221128-104950-marostegui.json
  • 10:48 aborrero@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1043.eqiad.wmnet with OS bullseye
  • 10:48 aborrero@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudvirt1043.eqiad.wmnet with OS bullseye
  • 10:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T321126)', diff saved to https://phabricator.wikimedia.org/P41279 and previous config saved to /var/cache/conftool/dbconfig/20221128-103444-marostegui.json
  • 10:32 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3314 (T321126)', diff saved to https://phabricator.wikimedia.org/P41278 and previous config saved to /var/cache/conftool/dbconfig/20221128-103234-marostegui.json
  • 10:32 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1144.eqiad.wmnet with reason: Maintenance
  • 10:32 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1144.eqiad.wmnet with reason: Maintenance
  • 10:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 (T321126)', diff saved to https://phabricator.wikimedia.org/P41277 and previous config saved to /var/cache/conftool/dbconfig/20221128-103213-marostegui.json
  • 10:31 aborrero@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1043.eqiad.wmnet with OS bullseye
  • 10:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P41276 and previous config saved to /var/cache/conftool/dbconfig/20221128-101706-marostegui.json
  • 10:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P41275 and previous config saved to /var/cache/conftool/dbconfig/20221128-100200-marostegui.json
  • 09:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 (T321126)', diff saved to https://phabricator.wikimedia.org/P41274 and previous config saved to /var/cache/conftool/dbconfig/20221128-094654-marostegui.json
  • 09:12 moritzm: rebalance Ganeti group A/eqiad T311687
  • 09:08 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2032.codfw.wmnet to cluster codfw and group B
  • 08:46 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1143 (T321126)', diff saved to https://phabricator.wikimedia.org/P41273 and previous config saved to /var/cache/conftool/dbconfig/20221128-084637-marostegui.json
  • 08:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1143.eqiad.wmnet with reason: Maintenance
  • 08:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1143.eqiad.wmnet with reason: Maintenance
  • 08:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 (T321126)', diff saved to https://phabricator.wikimedia.org/P41272 and previous config saved to /var/cache/conftool/dbconfig/20221128-084616-marostegui.json
  • 08:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2032.codfw.wmnet
  • 08:39 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 08:35 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2032.codfw.wmnet
  • 08:35 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 08:35 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
  • 08:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P41271 and previous config saved to /var/cache/conftool/dbconfig/20221128-083110-marostegui.json
  • 08:30 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
  • 08:25 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 08:25 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 08:24 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 08:22 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 08:22 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 08:22 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
  • 08:21 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 08:21 oblivian@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 08:21 kartik@deploy1002: Finished scap: Backport for Revert "Content Translation: Reverse MT threshold for Japanese Wikipedia" (duration: 11m 12s)
  • 08:21 oblivian@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 08:19 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/recommendation-api: apply
  • 08:19 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/recommendation-api: apply
  • 08:18 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
  • 08:16 kartik@deploy1002: kartik and trainbranchbot: Backport for Revert "Content Translation: Reverse MT threshold for Japanese Wikipedia" synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet
  • 08:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P41270 and previous config saved to /var/cache/conftool/dbconfig/20221128-081603-marostegui.json
  • 08:12 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 08:12 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/recommendation-api: apply
  • 08:11 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 08:11 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
  • 08:11 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/recommendation-api: apply
  • 08:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
  • 08:10 kartik@deploy1002: Started scap: Backport for Revert "Content Translation: Reverse MT threshold for Japanese Wikipedia"
  • 08:09 oblivian@deploy1002: helmfile [staging] DONE helmfile.d/services/recommendation-api: apply
  • 08:09 oblivian@deploy1002: helmfile [staging] START helmfile.d/services/recommendation-api: apply
  • 08:07 kartik@deploy1002: Backport cancelled.
  • 08:04 moritzm: rebalance Ganeti group C/codfw following reboots
  • 08:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 (T321126)', diff saved to https://phabricator.wikimedia.org/P41269 and previous config saved to /var/cache/conftool/dbconfig/20221128-080057-marostegui.json
  • 07:58 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1142 (T321126)', diff saved to https://phabricator.wikimedia.org/P41268 and previous config saved to /var/cache/conftool/dbconfig/20221128-075847-marostegui.json
  • 07:58 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1142.eqiad.wmnet with reason: Maintenance
  • 07:58 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1142.eqiad.wmnet with reason: Maintenance
  • 07:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 (T321126)', diff saved to https://phabricator.wikimedia.org/P41267 and previous config saved to /var/cache/conftool/dbconfig/20221128-075826-marostegui.json
  • 07:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P41266 and previous config saved to /var/cache/conftool/dbconfig/20221128-074319-marostegui.json
  • 07:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P41265 and previous config saved to /var/cache/conftool/dbconfig/20221128-072813-marostegui.json
  • 07:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 (T321126)', diff saved to https://phabricator.wikimedia.org/P41264 and previous config saved to /var/cache/conftool/dbconfig/20221128-071306-marostegui.json
  • 07:10 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1141 (T321126)', diff saved to https://phabricator.wikimedia.org/P41263 and previous config saved to /var/cache/conftool/dbconfig/20221128-071057-marostegui.json
  • 07:10 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1141.eqiad.wmnet with reason: Maintenance
  • 07:10 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1141.eqiad.wmnet with reason: Maintenance
  • 07:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 (T321126)', diff saved to https://phabricator.wikimedia.org/P41262 and previous config saved to /var/cache/conftool/dbconfig/20221128-071035-marostegui.json
  • 06:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P41261 and previous config saved to /var/cache/conftool/dbconfig/20221128-065529-marostegui.json
  • 06:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P41260 and previous config saved to /var/cache/conftool/dbconfig/20221128-064022-marostegui.json
  • 06:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 (T321126)', diff saved to https://phabricator.wikimedia.org/P41259 and previous config saved to /var/cache/conftool/dbconfig/20221128-062516-marostegui.json
  • 06:20 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1121 (T321126)', diff saved to https://phabricator.wikimedia.org/P41258 and previous config saved to /var/cache/conftool/dbconfig/20221128-062008-marostegui.json
  • 06:20 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 06:19 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 06:19 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1121.eqiad.wmnet with reason: Maintenance
  • 06:19 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1121.eqiad.wmnet with reason: Maintenance
  • 06:10 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1123.eqiad.wmnet with reason: Maintenance
  • 06:10 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1123.eqiad.wmnet with reason: Maintenance
  • 05:43 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2127.codfw.wmnet with reason: Maintenance
  • 05:42 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2127.codfw.wmnet with reason: Maintenance

2022-11-27

  • 03:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'db2105 (re)pooling @ 100%: Maint', diff saved to https://phabricator.wikimedia.org/P41257 and previous config saved to /var/cache/conftool/dbconfig/20221127-030126-ladsgroup.json
  • 02:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'db2105 (re)pooling @ 75%: Maint', diff saved to https://phabricator.wikimedia.org/P41256 and previous config saved to /var/cache/conftool/dbconfig/20221127-024621-ladsgroup.json
  • 02:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'db2105 (re)pooling @ 25%: Maint', diff saved to https://phabricator.wikimedia.org/P41255 and previous config saved to /var/cache/conftool/dbconfig/20221127-023116-ladsgroup.json
  • 02:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'db2105 (re)pooling @ 10%: Maint', diff saved to https://phabricator.wikimedia.org/P41254 and previous config saved to /var/cache/conftool/dbconfig/20221127-021611-ladsgroup.json

2022-11-26

  • 21:34 urandom: initiating Cassandra bootstrap, aqs1021-b -- T307802
  • 09:44 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 09:43 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 09:43 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
  • 09:42 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
  • 02:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2105 (T323827)', diff saved to https://phabricator.wikimedia.org/P41253 and previous config saved to /var/cache/conftool/dbconfig/20221126-023900-ladsgroup.json
  • 02:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2105.codfw.wmnet with reason: Maintenance
  • 02:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2105.codfw.wmnet with reason: Maintenance
  • 02:37 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 02:37 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 02:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T323827)', diff saved to https://phabricator.wikimedia.org/P41252 and previous config saved to /var/cache/conftool/dbconfig/20221126-023702-ladsgroup.json
  • 02:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P41251 and previous config saved to /var/cache/conftool/dbconfig/20221126-022156-ladsgroup.json
  • 02:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P41250 and previous config saved to /var/cache/conftool/dbconfig/20221126-020649-ladsgroup.json
  • 01:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T323827)', diff saved to https://phabricator.wikimedia.org/P41249 and previous config saved to /var/cache/conftool/dbconfig/20221126-015143-ladsgroup.json
  • 01:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 01:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 01:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T323827)', diff saved to https://phabricator.wikimedia.org/P41248 and previous config saved to /var/cache/conftool/dbconfig/20221126-013423-ladsgroup.json
  • 01:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1197 (T323827)', diff saved to https://phabricator.wikimedia.org/P41247 and previous config saved to /var/cache/conftool/dbconfig/20221126-013225-ladsgroup.json
  • 01:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1197.eqiad.wmnet with reason: Maintenance
  • 01:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1197.eqiad.wmnet with reason: Maintenance
  • 01:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T323827)', diff saved to https://phabricator.wikimedia.org/P41246 and previous config saved to /var/cache/conftool/dbconfig/20221126-013153-ladsgroup.json
  • 01:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P41245 and previous config saved to /var/cache/conftool/dbconfig/20221126-011917-ladsgroup.json
  • 01:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P41244 and previous config saved to /var/cache/conftool/dbconfig/20221126-011647-ladsgroup.json
  • 01:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P41243 and previous config saved to /var/cache/conftool/dbconfig/20221126-010411-ladsgroup.json
  • 01:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P41242 and previous config saved to /var/cache/conftool/dbconfig/20221126-010140-ladsgroup.json
  • 00:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T323827)', diff saved to https://phabricator.wikimedia.org/P41241 and previous config saved to /var/cache/conftool/dbconfig/20221126-004904-ladsgroup.json
  • 00:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T323827)', diff saved to https://phabricator.wikimedia.org/P41240 and previous config saved to /var/cache/conftool/dbconfig/20221126-004634-ladsgroup.json
  • 00:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T323827)', diff saved to https://phabricator.wikimedia.org/P41239 and previous config saved to /var/cache/conftool/dbconfig/20221126-004437-ladsgroup.json
  • 00:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1198 (T323827)', diff saved to https://phabricator.wikimedia.org/P41238 and previous config saved to /var/cache/conftool/dbconfig/20221126-003417-ladsgroup.json
  • 00:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1198.eqiad.wmnet with reason: Maintenance
  • 00:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1198.eqiad.wmnet with reason: Maintenance
  • 00:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T323827)', diff saved to https://phabricator.wikimedia.org/P41237 and previous config saved to /var/cache/conftool/dbconfig/20221126-003356-ladsgroup.json
  • 00:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1188 (T323827)', diff saved to https://phabricator.wikimedia.org/P41236 and previous config saved to /var/cache/conftool/dbconfig/20221126-003009-ladsgroup.json
  • 00:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1188.eqiad.wmnet with reason: Maintenance
  • 00:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1188.eqiad.wmnet with reason: Maintenance
  • 00:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T323827)', diff saved to https://phabricator.wikimedia.org/P41235 and previous config saved to /var/cache/conftool/dbconfig/20221126-002948-ladsgroup.json
  • 00:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P41234 and previous config saved to /var/cache/conftool/dbconfig/20221126-002932-ladsgroup.json
  • 00:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P41233 and previous config saved to /var/cache/conftool/dbconfig/20221126-001849-ladsgroup.json
  • 00:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P41232 and previous config saved to /var/cache/conftool/dbconfig/20221126-001441-ladsgroup.json
  • 00:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P41231 and previous config saved to /var/cache/conftool/dbconfig/20221126-001425-ladsgroup.json
  • 00:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P41230 and previous config saved to /var/cache/conftool/dbconfig/20221126-000343-ladsgroup.json

2022-11-25

  • 23:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P41229 and previous config saved to /var/cache/conftool/dbconfig/20221125-235935-ladsgroup.json
  • 23:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T323827)', diff saved to https://phabricator.wikimedia.org/P41228 and previous config saved to /var/cache/conftool/dbconfig/20221125-235919-ladsgroup.json
  • 23:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T323827)', diff saved to https://phabricator.wikimedia.org/P41227 and previous config saved to /var/cache/conftool/dbconfig/20221125-234836-ladsgroup.json
  • 23:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T323827)', diff saved to https://phabricator.wikimedia.org/P41226 and previous config saved to /var/cache/conftool/dbconfig/20221125-234428-ladsgroup.json
  • 23:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1158 (T323827)', diff saved to https://phabricator.wikimedia.org/P41225 and previous config saved to /var/cache/conftool/dbconfig/20221125-234305-ladsgroup.json
  • 23:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 23:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 20:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 23:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 23:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 23:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T323827)', diff saved to https://phabricator.wikimedia.org/P41224 and previous config saved to /var/cache/conftool/dbconfig/20221125-233002-ladsgroup.json
  • 23:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P41223 and previous config saved to /var/cache/conftool/dbconfig/20221125-231456-ladsgroup.json
  • 23:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1182 (T323827)', diff saved to https://phabricator.wikimedia.org/P41222 and previous config saved to /var/cache/conftool/dbconfig/20221125-230518-ladsgroup.json
  • 23:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1182.eqiad.wmnet with reason: Maintenance
  • 23:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1182.eqiad.wmnet with reason: Maintenance
  • 23:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T323827)', diff saved to https://phabricator.wikimedia.org/P41221 and previous config saved to /var/cache/conftool/dbconfig/20221125-230457-ladsgroup.json
  • 23:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1189 (T323827)', diff saved to https://phabricator.wikimedia.org/P41220 and previous config saved to /var/cache/conftool/dbconfig/20221125-230143-ladsgroup.json
  • 23:01 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1189.eqiad.wmnet with reason: Maintenance
  • 23:01 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1189.eqiad.wmnet with reason: Maintenance
  • 23:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 (T323827)', diff saved to https://phabricator.wikimedia.org/P41219 and previous config saved to /var/cache/conftool/dbconfig/20221125-230122-ladsgroup.json
  • 22:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P41218 and previous config saved to /var/cache/conftool/dbconfig/20221125-225949-ladsgroup.json
  • 22:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P41217 and previous config saved to /var/cache/conftool/dbconfig/20221125-224951-ladsgroup.json
  • 22:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P41216 and previous config saved to /var/cache/conftool/dbconfig/20221125-224615-ladsgroup.json
  • 22:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T323827)', diff saved to https://phabricator.wikimedia.org/P41215 and previous config saved to /var/cache/conftool/dbconfig/20221125-224443-ladsgroup.json
  • 22:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P41214 and previous config saved to /var/cache/conftool/dbconfig/20221125-223444-ladsgroup.json
  • 22:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P41213 and previous config saved to /var/cache/conftool/dbconfig/20221125-223109-ladsgroup.json
  • 22:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T323827)', diff saved to https://phabricator.wikimedia.org/P41212 and previous config saved to /var/cache/conftool/dbconfig/20221125-221938-ladsgroup.json
  • 22:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 (T323827)', diff saved to https://phabricator.wikimedia.org/P41211 and previous config saved to /var/cache/conftool/dbconfig/20221125-221602-ladsgroup.json
  • 22:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1179 (T323827)', diff saved to https://phabricator.wikimedia.org/P41210 and previous config saved to /var/cache/conftool/dbconfig/20221125-221218-ladsgroup.json
  • 22:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1179.eqiad.wmnet with reason: Maintenance
  • 22:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1179.eqiad.wmnet with reason: Maintenance
  • 22:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T323827)', diff saved to https://phabricator.wikimedia.org/P41209 and previous config saved to /var/cache/conftool/dbconfig/20221125-221157-ladsgroup.json
  • 22:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2175 (T323827)', diff saved to https://phabricator.wikimedia.org/P41208 and previous config saved to /var/cache/conftool/dbconfig/20221125-220602-ladsgroup.json
  • 22:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2175.codfw.wmnet with reason: Maintenance
  • 22:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2175.codfw.wmnet with reason: Maintenance
  • 22:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312 (T323827)', diff saved to https://phabricator.wikimedia.org/P41207 and previous config saved to /var/cache/conftool/dbconfig/20221125-220541-ladsgroup.json
  • 21:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P41206 and previous config saved to /var/cache/conftool/dbconfig/20221125-215651-ladsgroup.json
  • 21:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312', diff saved to https://phabricator.wikimedia.org/P41205 and previous config saved to /var/cache/conftool/dbconfig/20221125-215034-ladsgroup.json
  • 21:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P41204 and previous config saved to /var/cache/conftool/dbconfig/20221125-214144-ladsgroup.json
  • 21:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3312 (T323827)', diff saved to https://phabricator.wikimedia.org/P41203 and previous config saved to /var/cache/conftool/dbconfig/20221125-214038-ladsgroup.json
  • 21:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 21:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 21:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T323827)', diff saved to https://phabricator.wikimedia.org/P41202 and previous config saved to /var/cache/conftool/dbconfig/20221125-214016-ladsgroup.json
  • 21:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312', diff saved to https://phabricator.wikimedia.org/P41201 and previous config saved to /var/cache/conftool/dbconfig/20221125-213527-ladsgroup.json
  • 21:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T323827)', diff saved to https://phabricator.wikimedia.org/P41200 and previous config saved to /var/cache/conftool/dbconfig/20221125-212638-ladsgroup.json
  • 21:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P41199 and previous config saved to /var/cache/conftool/dbconfig/20221125-212510-ladsgroup.json
  • 21:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312 (T323827)', diff saved to https://phabricator.wikimedia.org/P41198 and previous config saved to /var/cache/conftool/dbconfig/20221125-212020-ladsgroup.json
  • 21:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1175 (T323827)', diff saved to https://phabricator.wikimedia.org/P41197 and previous config saved to /var/cache/conftool/dbconfig/20221125-211137-ladsgroup.json
  • 21:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1175.eqiad.wmnet with reason: Maintenance
  • 21:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1175.eqiad.wmnet with reason: Maintenance
  • 21:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T323827)', diff saved to https://phabricator.wikimedia.org/P41196 and previous config saved to /var/cache/conftool/dbconfig/20221125-211116-ladsgroup.json
  • 21:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P41195 and previous config saved to /var/cache/conftool/dbconfig/20221125-211003-ladsgroup.json
  • 20:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P41194 and previous config saved to /var/cache/conftool/dbconfig/20221125-205609-ladsgroup.json
  • 20:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T323827)', diff saved to https://phabricator.wikimedia.org/P41193 and previous config saved to /var/cache/conftool/dbconfig/20221125-205457-ladsgroup.json
  • 20:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2170:3312 (T323827)', diff saved to https://phabricator.wikimedia.org/P41192 and previous config saved to /var/cache/conftool/dbconfig/20221125-204244-ladsgroup.json
  • 20:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2170.codfw.wmnet with reason: Maintenance
  • 20:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2170.codfw.wmnet with reason: Maintenance
  • 20:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T323827)', diff saved to https://phabricator.wikimedia.org/P41191 and previous config saved to /var/cache/conftool/dbconfig/20221125-204211-ladsgroup.json
  • 20:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P41190 and previous config saved to /var/cache/conftool/dbconfig/20221125-204103-ladsgroup.json
  • 20:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P41189 and previous config saved to /var/cache/conftool/dbconfig/20221125-202705-ladsgroup.json
  • 20:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T323827)', diff saved to https://phabricator.wikimedia.org/P41188 and previous config saved to /var/cache/conftool/dbconfig/20221125-202557-ladsgroup.json
  • 20:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1156 (T323827)', diff saved to https://phabricator.wikimedia.org/P41187 and previous config saved to /var/cache/conftool/dbconfig/20221125-201754-ladsgroup.json
  • 20:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 20:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 20:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 20:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1156.eqiad.wmnet with reason: Maintenance
  • 20:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1156.eqiad.wmnet with reason: Maintenance
  • 20:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T323827)', diff saved to https://phabricator.wikimedia.org/P41186 and previous config saved to /var/cache/conftool/dbconfig/20221125-201705-ladsgroup.json
  • 20:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P41185 and previous config saved to /var/cache/conftool/dbconfig/20221125-201158-ladsgroup.json
  • 20:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1166 (T323827)', diff saved to https://phabricator.wikimedia.org/P41184 and previous config saved to /var/cache/conftool/dbconfig/20221125-201111-ladsgroup.json
  • 20:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 20:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 20:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T323827)', diff saved to https://phabricator.wikimedia.org/P41183 and previous config saved to /var/cache/conftool/dbconfig/20221125-201049-ladsgroup.json
  • 20:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P41182 and previous config saved to /var/cache/conftool/dbconfig/20221125-200158-ladsgroup.json
  • 19:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T323827)', diff saved to https://phabricator.wikimedia.org/P41181 and previous config saved to /var/cache/conftool/dbconfig/20221125-195652-ladsgroup.json
  • 19:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P41180 and previous config saved to /var/cache/conftool/dbconfig/20221125-195543-ladsgroup.json
  • 19:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P41179 and previous config saved to /var/cache/conftool/dbconfig/20221125-194652-ladsgroup.json
  • 19:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P41178 and previous config saved to /var/cache/conftool/dbconfig/20221125-194036-ladsgroup.json
  • 19:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T321126)', diff saved to https://phabricator.wikimedia.org/P41177 and previous config saved to /var/cache/conftool/dbconfig/20221125-193503-marostegui.json
  • 19:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T323827)', diff saved to https://phabricator.wikimedia.org/P41176 and previous config saved to /var/cache/conftool/dbconfig/20221125-193145-ladsgroup.json
  • 19:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T323827)', diff saved to https://phabricator.wikimedia.org/P41175 and previous config saved to /var/cache/conftool/dbconfig/20221125-192530-ladsgroup.json
  • 19:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1157 (T323827)', diff saved to https://phabricator.wikimedia.org/P41174 and previous config saved to /var/cache/conftool/dbconfig/20221125-192147-ladsgroup.json
  • 19:21 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1157.eqiad.wmnet with reason: Maintenance
  • 19:21 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1157.eqiad.wmnet with reason: Maintenance
  • 19:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P41173 and previous config saved to /var/cache/conftool/dbconfig/20221125-191956-marostegui.json
  • 19:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2148 (T323827)', diff saved to https://phabricator.wikimedia.org/P41172 and previous config saved to /var/cache/conftool/dbconfig/20221125-191937-ladsgroup.json
  • 19:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2148.codfw.wmnet with reason: Maintenance
  • 19:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2148.codfw.wmnet with reason: Maintenance
  • 19:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312 (T323827)', diff saved to https://phabricator.wikimedia.org/P41171 and previous config saved to /var/cache/conftool/dbconfig/20221125-191915-ladsgroup.json
  • 19:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P41170 and previous config saved to /var/cache/conftool/dbconfig/20221125-190450-marostegui.json
  • 19:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312', diff saved to https://phabricator.wikimedia.org/P41169 and previous config saved to /var/cache/conftool/dbconfig/20221125-190409-ladsgroup.json
  • 18:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 18:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 18:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 (T323827)', diff saved to https://phabricator.wikimedia.org/P41168 and previous config saved to /var/cache/conftool/dbconfig/20221125-185312-ladsgroup.json
  • 18:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3312 (T323827)', diff saved to https://phabricator.wikimedia.org/P41167 and previous config saved to /var/cache/conftool/dbconfig/20221125-185257-ladsgroup.json
  • 18:52 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1146.eqiad.wmnet with reason: Maintenance
  • 18:52 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1146.eqiad.wmnet with reason: Maintenance
  • 18:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T321126)', diff saved to https://phabricator.wikimedia.org/P41166 and previous config saved to /var/cache/conftool/dbconfig/20221125-184943-marostegui.json
  • 18:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312', diff saved to https://phabricator.wikimedia.org/P41165 and previous config saved to /var/cache/conftool/dbconfig/20221125-184902-ladsgroup.json
  • 18:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P41164 and previous config saved to /var/cache/conftool/dbconfig/20221125-183806-ladsgroup.json
  • 18:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312 (T323827)', diff saved to https://phabricator.wikimedia.org/P41163 and previous config saved to /var/cache/conftool/dbconfig/20221125-183356-ladsgroup.json
  • 18:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P41162 and previous config saved to /var/cache/conftool/dbconfig/20221125-182259-ladsgroup.json
  • 18:21 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2177 (T321126)', diff saved to https://phabricator.wikimedia.org/P41161 and previous config saved to /var/cache/conftool/dbconfig/20221125-182126-marostegui.json
  • 18:21 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2177.codfw.wmnet with reason: Maintenance
  • 18:21 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2177.codfw.wmnet with reason: Maintenance
  • 18:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T321126)', diff saved to https://phabricator.wikimedia.org/P41160 and previous config saved to /var/cache/conftool/dbconfig/20221125-182105-marostegui.json
  • 18:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1139.eqiad.wmnet with reason: Maintenance
  • 18:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1139.eqiad.wmnet with reason: Maintenance
  • 18:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 (T323827)', diff saved to https://phabricator.wikimedia.org/P41159 and previous config saved to /var/cache/conftool/dbconfig/20221125-181900-ladsgroup.json
  • 18:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 (T323827)', diff saved to https://phabricator.wikimedia.org/P41158 and previous config saved to /var/cache/conftool/dbconfig/20221125-180753-ladsgroup.json
  • 18:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P41157 and previous config saved to /var/cache/conftool/dbconfig/20221125-180558-marostegui.json
  • 18:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P41156 and previous config saved to /var/cache/conftool/dbconfig/20221125-180353-ladsgroup.json
  • 17:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2138:3312 (T323827)', diff saved to https://phabricator.wikimedia.org/P41155 and previous config saved to /var/cache/conftool/dbconfig/20221125-175624-ladsgroup.json
  • 17:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2138.codfw.wmnet with reason: Maintenance
  • 17:55 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2138.codfw.wmnet with reason: Maintenance
  • 17:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126 (T323827)', diff saved to https://phabricator.wikimedia.org/P41154 and previous config saved to /var/cache/conftool/dbconfig/20221125-175551-ladsgroup.json
  • 17:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1112 (T323827)', diff saved to https://phabricator.wikimedia.org/P41153 and previous config saved to /var/cache/conftool/dbconfig/20221125-175114-ladsgroup.json
  • 17:51 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 17:51 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 20:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 17:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P41152 and previous config saved to /var/cache/conftool/dbconfig/20221125-175052-marostegui.json
  • 17:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1112.eqiad.wmnet with reason: Maintenance
  • 17:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1112.eqiad.wmnet with reason: Maintenance
  • 17:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P41151 and previous config saved to /var/cache/conftool/dbconfig/20221125-174847-ladsgroup.json
  • 17:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P41150 and previous config saved to /var/cache/conftool/dbconfig/20221125-174045-ladsgroup.json
  • 17:38 urandom: initiating Cassandra bootstrap, aqs1021-a -- T307802
  • 17:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T321126)', diff saved to https://phabricator.wikimedia.org/P41149 and previous config saved to /var/cache/conftool/dbconfig/20221125-173545-marostegui.json
  • 17:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 (T323827)', diff saved to https://phabricator.wikimedia.org/P41148 and previous config saved to /var/cache/conftool/dbconfig/20221125-173340-ladsgroup.json
  • 17:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P41147 and previous config saved to /var/cache/conftool/dbconfig/20221125-172538-ladsgroup.json
  • 17:21 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1102.eqiad.wmnet with reason: Maintenance
  • 17:21 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1102.eqiad.wmnet with reason: Maintenance
  • 17:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1129 (T323827)', diff saved to https://phabricator.wikimedia.org/P41146 and previous config saved to /var/cache/conftool/dbconfig/20221125-171729-ladsgroup.json
  • 17:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1129.eqiad.wmnet with reason: Maintenance
  • 17:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1129.eqiad.wmnet with reason: Maintenance
  • 17:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122 (T323827)', diff saved to https://phabricator.wikimedia.org/P41145 and previous config saved to /var/cache/conftool/dbconfig/20221125-171707-ladsgroup.json
  • 17:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126 (T323827)', diff saved to https://phabricator.wikimedia.org/P41144 and previous config saved to /var/cache/conftool/dbconfig/20221125-171032-ladsgroup.json
  • 17:09 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2156 (T321126)', diff saved to https://phabricator.wikimedia.org/P41143 and previous config saved to /var/cache/conftool/dbconfig/20221125-170859-marostegui.json
  • 17:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2094.codfw.wmnet with reason: Maintenance
  • 17:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2094.codfw.wmnet with reason: Maintenance
  • 17:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2156.codfw.wmnet with reason: Maintenance
  • 17:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2156.codfw.wmnet with reason: Maintenance
  • 17:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T321126)', diff saved to https://phabricator.wikimedia.org/P41142 and previous config saved to /var/cache/conftool/dbconfig/20221125-170811-marostegui.json
  • 17:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122', diff saved to https://phabricator.wikimedia.org/P41141 and previous config saved to /var/cache/conftool/dbconfig/20221125-170200-ladsgroup.json
  • 16:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2126 (T323827)', diff saved to https://phabricator.wikimedia.org/P41140 and previous config saved to /var/cache/conftool/dbconfig/20221125-165341-ladsgroup.json
  • 16:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on db2095.codfw.wmnet with reason: Maintenance
  • 16:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 20:00:00 on db2095.codfw.wmnet with reason: Maintenance
  • 16:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2126.codfw.wmnet with reason: Maintenance
  • 16:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2126.codfw.wmnet with reason: Maintenance
  • 16:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2125 (T323827)', diff saved to https://phabricator.wikimedia.org/P41139 and previous config saved to /var/cache/conftool/dbconfig/20221125-165315-ladsgroup.json
  • 16:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P41138 and previous config saved to /var/cache/conftool/dbconfig/20221125-165304-marostegui.json
  • 16:49 mfossati@deploy1002: Finished deploy [airflow-dags/platform_eng@f6b8a0a]: (no justification provided) (duration: 00m 18s)
  • 16:49 mfossati@deploy1002: Started deploy [airflow-dags/platform_eng@f6b8a0a]: (no justification provided)
  • 16:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122', diff saved to https://phabricator.wikimedia.org/P41137 and previous config saved to /var/cache/conftool/dbconfig/20221125-164654-ladsgroup.json
  • 16:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2125', diff saved to https://phabricator.wikimedia.org/P41136 and previous config saved to /var/cache/conftool/dbconfig/20221125-163808-ladsgroup.json
  • 16:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P41135 and previous config saved to /var/cache/conftool/dbconfig/20221125-163758-marostegui.json
  • 16:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122 (T323827)', diff saved to https://phabricator.wikimedia.org/P41134 and previous config saved to /var/cache/conftool/dbconfig/20221125-163147-ladsgroup.json
  • off: restarted turnilo on an-tool1007
  • 16:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2125', diff saved to https://phabricator.wikimedia.org/P41133 and previous config saved to /var/cache/conftool/dbconfig/20221125-162302-ladsgroup.json
  • 16:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T321126)', diff saved to https://phabricator.wikimedia.org/P41132 and previous config saved to /var/cache/conftool/dbconfig/20221125-162251-marostegui.json
  • 16:11 _joe_: upgraded vopsbot to 0.3.2
  • 16:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2125 (T323827)', diff saved to https://phabricator.wikimedia.org/P41131 and previous config saved to /var/cache/conftool/dbconfig/20221125-160755-ladsgroup.json
  • 15:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2149 (T321126)', diff saved to https://phabricator.wikimedia.org/P41130 and previous config saved to /var/cache/conftool/dbconfig/20221125-155447-marostegui.json
  • 15:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2149.codfw.wmnet with reason: Maintenance
  • 15:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2149.codfw.wmnet with reason: Maintenance
  • 15:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1122 (T323827)', diff saved to https://phabricator.wikimedia.org/P41129 and previous config saved to /var/cache/conftool/dbconfig/20221125-155300-ladsgroup.json
  • 15:52 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1122.eqiad.wmnet with reason: Maintenance
  • 15:52 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1122.eqiad.wmnet with reason: Maintenance
  • 15:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 (T323827)', diff saved to https://phabricator.wikimedia.org/P41128 and previous config saved to /var/cache/conftool/dbconfig/20221125-155238-ladsgroup.json
  • 15:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P41127 and previous config saved to /var/cache/conftool/dbconfig/20221125-153732-ladsgroup.json
  • 15:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 15:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 15:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109 (T321126)', diff saved to https://phabricator.wikimedia.org/P41126 and previous config saved to /var/cache/conftool/dbconfig/20221125-152810-marostegui.json
  • 15:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2125 (T323827)', diff saved to https://phabricator.wikimedia.org/P41125 and previous config saved to /var/cache/conftool/dbconfig/20221125-152704-ladsgroup.json
  • 15:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2125.codfw.wmnet with reason: Maintenance
  • 15:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2125.codfw.wmnet with reason: Maintenance
  • 15:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2104 (T323827)', diff saved to https://phabricator.wikimedia.org/P41124 and previous config saved to /var/cache/conftool/dbconfig/20221125-152642-ladsgroup.json
  • 15:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P41123 and previous config saved to /var/cache/conftool/dbconfig/20221125-152225-ladsgroup.json
  • 15:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P41122 and previous config saved to /var/cache/conftool/dbconfig/20221125-151303-marostegui.json
  • 15:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2104', diff saved to https://phabricator.wikimedia.org/P41121 and previous config saved to /var/cache/conftool/dbconfig/20221125-151135-ladsgroup.json
  • 15:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 (T323827)', diff saved to https://phabricator.wikimedia.org/P41120 and previous config saved to /var/cache/conftool/dbconfig/20221125-150719-ladsgroup.json
  • 14:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P41119 and previous config saved to /var/cache/conftool/dbconfig/20221125-145757-marostegui.json
  • 14:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2104', diff saved to https://phabricator.wikimedia.org/P41118 and previous config saved to /var/cache/conftool/dbconfig/20221125-145629-ladsgroup.json
  • 14:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109 (T321126)', diff saved to https://phabricator.wikimedia.org/P41117 and previous config saved to /var/cache/conftool/dbconfig/20221125-144251-marostegui.json
  • 14:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2104 (T323827)', diff saved to https://phabricator.wikimedia.org/P41116 and previous config saved to /var/cache/conftool/dbconfig/20221125-144123-ladsgroup.json
  • 14:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3312 (T323827)', diff saved to https://phabricator.wikimedia.org/P41115 and previous config saved to /var/cache/conftool/dbconfig/20221125-142525-ladsgroup.json
  • 14:25 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1105.eqiad.wmnet with reason: Maintenance
  • 14:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2104 (T323827)', diff saved to https://phabricator.wikimedia.org/P41114 and previous config saved to /var/cache/conftool/dbconfig/20221125-142506-ladsgroup.json
  • 14:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2104.codfw.wmnet with reason: Maintenance
  • 14:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1105.eqiad.wmnet with reason: Maintenance
  • 14:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2104.codfw.wmnet with reason: Maintenance
  • 14:14 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2109 (T321126)', diff saved to https://phabricator.wikimedia.org/P41113 and previous config saved to /var/cache/conftool/dbconfig/20221125-141434-marostegui.json
  • 14:14 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2109.codfw.wmnet with reason: Maintenance
  • 14:14 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2109.codfw.wmnet with reason: Maintenance
  • 14:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105 (T321126)', diff saved to https://phabricator.wikimedia.org/P41112 and previous config saved to /var/cache/conftool/dbconfig/20221125-141412-marostegui.json
  • 13:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105', diff saved to https://phabricator.wikimedia.org/P41111 and previous config saved to /var/cache/conftool/dbconfig/20221125-135906-marostegui.json
  • 13:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2097.codfw.wmnet with reason: Maintenance
  • 13:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1102.eqiad.wmnet with reason: Maintenance
  • 13:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2097.codfw.wmnet with reason: Maintenance
  • 13:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1102.eqiad.wmnet with reason: Maintenance
  • 13:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105', diff saved to https://phabricator.wikimedia.org/P41110 and previous config saved to /var/cache/conftool/dbconfig/20221125-134359-marostegui.json
  • 13:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105 (T321126)', diff saved to https://phabricator.wikimedia.org/P41109 and previous config saved to /var/cache/conftool/dbconfig/20221125-132853-marostegui.json
  • 13:11 gehel: re-enabling puppet on wcqs1001 - data transfer completed - T321605
  • 12:59 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2105 (T321126)', diff saved to https://phabricator.wikimedia.org/P41108 and previous config saved to /var/cache/conftool/dbconfig/20221125-125935-marostegui.json
  • 12:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2105.codfw.wmnet with reason: Maintenance
  • 12:59 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2105.codfw.wmnet with reason: Maintenance
  • 12:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 12:50 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 12:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T321126)', diff saved to https://phabricator.wikimedia.org/P41107 and previous config saved to /var/cache/conftool/dbconfig/20221125-125046-marostegui.json
  • 12:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P41106 and previous config saved to /var/cache/conftool/dbconfig/20221125-123540-marostegui.json
  • 12:26 moritzm: installing vim security updates
  • 12:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P41105 and previous config saved to /var/cache/conftool/dbconfig/20221125-122033-marostegui.json
  • 12:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2031.codfw.wmnet to cluster codfw and group B
  • 12:08 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2031.codfw.wmnet to cluster codfw and group B
  • 12:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T321126)', diff saved to https://phabricator.wikimedia.org/P41104 and previous config saved to /var/cache/conftool/dbconfig/20221125-120527-marostegui.json
  • 11:52 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1198 (T321126)', diff saved to https://phabricator.wikimedia.org/P41103 and previous config saved to /var/cache/conftool/dbconfig/20221125-115222-marostegui.json
  • 11:52 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1198.eqiad.wmnet with reason: Maintenance
  • 11:52 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1198.eqiad.wmnet with reason: Maintenance
  • 11:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T321126)', diff saved to https://phabricator.wikimedia.org/P41102 and previous config saved to /var/cache/conftool/dbconfig/20221125-115201-marostegui.json
  • 11:38 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2031.codfw.wmnet
  • 11:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P41101 and previous config saved to /var/cache/conftool/dbconfig/20221125-113654-marostegui.json
  • 11:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2031.codfw.wmnet
  • 11:24 elukey: restart turnilo on an-tool1007 to pick up new settings for webrequest_sampled_live
  • 11:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P41100 and previous config saved to /var/cache/conftool/dbconfig/20221125-112148-marostegui.json
  • 11:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T321126)', diff saved to https://phabricator.wikimedia.org/P41099 and previous config saved to /var/cache/conftool/dbconfig/20221125-110642-marostegui.json
  • 10:50 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1189 (T321126)', diff saved to https://phabricator.wikimedia.org/P41098 and previous config saved to /var/cache/conftool/dbconfig/20221125-105036-marostegui.json
  • 10:50 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1189.eqiad.wmnet with reason: Maintenance
  • 10:50 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1189.eqiad.wmnet with reason: Maintenance
  • 10:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 (T321126)', diff saved to https://phabricator.wikimedia.org/P41097 and previous config saved to /var/cache/conftool/dbconfig/20221125-105015-marostegui.json
  • 10:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P41096 and previous config saved to /var/cache/conftool/dbconfig/20221125-103509-marostegui.json
  • 10:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P41095 and previous config saved to /var/cache/conftool/dbconfig/20221125-102002-marostegui.json
  • 10:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 (T321126)', diff saved to https://phabricator.wikimedia.org/P41094 and previous config saved to /var/cache/conftool/dbconfig/20221125-100456-marostegui.json
  • 09:46 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1179 (T321126)', diff saved to https://phabricator.wikimedia.org/P41093 and previous config saved to /var/cache/conftool/dbconfig/20221125-094643-marostegui.json
  • 09:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1179.eqiad.wmnet with reason: Maintenance
  • 09:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1179.eqiad.wmnet with reason: Maintenance
  • 09:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T321126)', diff saved to https://phabricator.wikimedia.org/P41092 and previous config saved to /var/cache/conftool/dbconfig/20221125-094622-marostegui.json
  • 09:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P41091 and previous config saved to /var/cache/conftool/dbconfig/20221125-093115-marostegui.json
  • 09:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P41090 and previous config saved to /var/cache/conftool/dbconfig/20221125-091609-marostegui.json
  • 09:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T321126)', diff saved to https://phabricator.wikimedia.org/P41089 and previous config saved to /var/cache/conftool/dbconfig/20221125-090102-marostegui.json
  • 08:51 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1175 (T321126)', diff saved to https://phabricator.wikimedia.org/P41088 and previous config saved to /var/cache/conftool/dbconfig/20221125-085101-marostegui.json
  • 08:50 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1175.eqiad.wmnet with reason: Maintenance
  • 08:50 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1175.eqiad.wmnet with reason: Maintenance
  • 08:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T321126)', diff saved to https://phabricator.wikimedia.org/P41087 and previous config saved to /var/cache/conftool/dbconfig/20221125-085040-marostegui.json
  • 08:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P41086 and previous config saved to /var/cache/conftool/dbconfig/20221125-083534-marostegui.json
  • 08:35 moritzm: installing libarchive security updates
  • 08:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P41085 and previous config saved to /var/cache/conftool/dbconfig/20221125-082027-marostegui.json
  • 08:09 moritzm: rebalance Ganeti group C/codfw following reboots
  • 08:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T321126)', diff saved to https://phabricator.wikimedia.org/P41084 and previous config saved to /var/cache/conftool/dbconfig/20221125-080521-marostegui.json
  • 07:55 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1166 (T321126)', diff saved to https://phabricator.wikimedia.org/P41083 and previous config saved to /var/cache/conftool/dbconfig/20221125-075521-marostegui.json
  • 07:55 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 07:55 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 07:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T321126)', diff saved to https://phabricator.wikimedia.org/P41082 and previous config saved to /var/cache/conftool/dbconfig/20221125-075500-marostegui.json
  • 07:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P41081 and previous config saved to /var/cache/conftool/dbconfig/20221125-073953-marostegui.json
  • 07:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P41080 and previous config saved to /var/cache/conftool/dbconfig/20221125-072447-marostegui.json
  • 07:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T321126)', diff saved to https://phabricator.wikimedia.org/P41079 and previous config saved to /var/cache/conftool/dbconfig/20221125-070940-marostegui.json
  • 06:59 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1157 (T321126)', diff saved to https://phabricator.wikimedia.org/P41078 and previous config saved to /var/cache/conftool/dbconfig/20221125-065930-marostegui.json
  • 06:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1157.eqiad.wmnet with reason: Maintenance
  • 06:59 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1157.eqiad.wmnet with reason: Maintenance
  • 06:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 06:50 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 06:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 (T321126)', diff saved to https://phabricator.wikimedia.org/P41077 and previous config saved to /var/cache/conftool/dbconfig/20221125-065049-marostegui.json
  • 06:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P41076 and previous config saved to /var/cache/conftool/dbconfig/20221125-063543-marostegui.json
  • 06:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P41075 and previous config saved to /var/cache/conftool/dbconfig/20221125-062036-marostegui.json
  • 06:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 (T321126)', diff saved to https://phabricator.wikimedia.org/P41074 and previous config saved to /var/cache/conftool/dbconfig/20221125-060530-marostegui.json
  • 05:55 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1112 (T321126)', diff saved to https://phabricator.wikimedia.org/P41073 and previous config saved to /var/cache/conftool/dbconfig/20221125-055517-marostegui.json
  • 05:55 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 05:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 05:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1112.eqiad.wmnet with reason: Maintenance
  • 05:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1112.eqiad.wmnet with reason: Maintenance
  • 05:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1102.eqiad.wmnet with reason: Maintenance
  • 05:45 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1102.eqiad.wmnet with reason: Maintenance
  • 05:44 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1109.eqiad.wmnet with reason: Maintenance
  • 05:44 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1109.eqiad.wmnet with reason: Maintenance
  • 05:34 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2165.codfw.wmnet with reason: Maintenance
  • 05:34 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2165.codfw.wmnet with reason: Maintenance
  • 01:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T321126)', diff saved to https://phabricator.wikimedia.org/P41072 and previous config saved to /var/cache/conftool/dbconfig/20221125-013324-marostegui.json
  • 01:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P41071 and previous config saved to /var/cache/conftool/dbconfig/20221125-011818-marostegui.json
  • 01:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P41070 and previous config saved to /var/cache/conftool/dbconfig/20221125-010311-marostegui.json
  • 00:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T322618)', diff saved to https://phabricator.wikimedia.org/P41069 and previous config saved to /var/cache/conftool/dbconfig/20221125-005150-ladsgroup.json
  • 00:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T321126)', diff saved to https://phabricator.wikimedia.org/P41068 and previous config saved to /var/cache/conftool/dbconfig/20221125-004805-marostegui.json
  • 00:45 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2181 (T321126)', diff saved to https://phabricator.wikimedia.org/P41067 and previous config saved to /var/cache/conftool/dbconfig/20221125-004554-marostegui.json
  • 00:45 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2181.codfw.wmnet with reason: Maintenance
  • 00:45 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2181.codfw.wmnet with reason: Maintenance
  • 00:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3318 (T321126)', diff saved to https://phabricator.wikimedia.org/P41066 and previous config saved to /var/cache/conftool/dbconfig/20221125-004533-marostegui.json
  • 00:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P41065 and previous config saved to /var/cache/conftool/dbconfig/20221125-003643-ladsgroup.json
  • 00:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3318', diff saved to https://phabricator.wikimedia.org/P41064 and previous config saved to /var/cache/conftool/dbconfig/20221125-003026-marostegui.json
  • 00:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P41063 and previous config saved to /var/cache/conftool/dbconfig/20221125-002137-ladsgroup.json
  • 00:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1181 (re)pooling @ 100%: Maint done', diff saved to https://phabricator.wikimedia.org/P41062 and previous config saved to /var/cache/conftool/dbconfig/20221125-002119-ladsgroup.json
  • 00:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3318', diff saved to https://phabricator.wikimedia.org/P41061 and previous config saved to /var/cache/conftool/dbconfig/20221125-001520-marostegui.json
  • 00:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T322618)', diff saved to https://phabricator.wikimedia.org/P41060 and previous config saved to /var/cache/conftool/dbconfig/20221125-000630-ladsgroup.json
  • 00:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1181 (re)pooling @ 75%: Maint done', diff saved to https://phabricator.wikimedia.org/P41059 and previous config saved to /var/cache/conftool/dbconfig/20221125-000614-ladsgroup.json
  • 00:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1194 (T322618)', diff saved to https://phabricator.wikimedia.org/P41058 and previous config saved to /var/cache/conftool/dbconfig/20221125-000421-ladsgroup.json
  • 00:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1194.eqiad.wmnet with reason: Maintenance
  • 00:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1194.eqiad.wmnet with reason: Maintenance
  • 00:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3318 (T321126)', diff saved to https://phabricator.wikimedia.org/P41057 and previous config saved to /var/cache/conftool/dbconfig/20221125-000013-marostegui.json

2022-11-24

  • 23:58 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2168:3318 (T321126)', diff saved to https://phabricator.wikimedia.org/P41056 and previous config saved to /var/cache/conftool/dbconfig/20221124-235803-marostegui.json
  • 23:57 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2168.codfw.wmnet with reason: Maintenance
  • 23:57 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2168.codfw.wmnet with reason: Maintenance
  • 23:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318 (T321126)', diff saved to https://phabricator.wikimedia.org/P41055 and previous config saved to /var/cache/conftool/dbconfig/20221124-235741-marostegui.json
  • 23:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1181 (re)pooling @ 25%: Maint done', diff saved to https://phabricator.wikimedia.org/P41054 and previous config saved to /var/cache/conftool/dbconfig/20221124-235109-ladsgroup.json
  • 23:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318', diff saved to https://phabricator.wikimedia.org/P41053 and previous config saved to /var/cache/conftool/dbconfig/20221124-234234-marostegui.json
  • 23:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1181 (re)pooling @ 10%: Maint done', diff saved to https://phabricator.wikimedia.org/P41052 and previous config saved to /var/cache/conftool/dbconfig/20221124-233604-ladsgroup.json
  • 23:33 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 23:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 23:33 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 23:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 23:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318', diff saved to https://phabricator.wikimedia.org/P41051 and previous config saved to /var/cache/conftool/dbconfig/20221124-232728-marostegui.json
  • 23:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 23:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 23:15 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 23:15 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 23:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 23:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 23:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318 (T321126)', diff saved to https://phabricator.wikimedia.org/P41050 and previous config saved to /var/cache/conftool/dbconfig/20221124-231221-marostegui.json
  • 23:10 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2167:3318 (T321126)', diff saved to https://phabricator.wikimedia.org/P41049 and previous config saved to /var/cache/conftool/dbconfig/20221124-231011-marostegui.json
  • 23:10 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2167.codfw.wmnet with reason: Maintenance
  • 23:09 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2167.codfw.wmnet with reason: Maintenance
  • 23:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T321126)', diff saved to https://phabricator.wikimedia.org/P41048 and previous config saved to /var/cache/conftool/dbconfig/20221124-230949-marostegui.json
  • 22:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P41047 and previous config saved to /var/cache/conftool/dbconfig/20221124-225443-marostegui.json
  • 22:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P41046 and previous config saved to /var/cache/conftool/dbconfig/20221124-223937-marostegui.json
  • 22:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T321126)', diff saved to https://phabricator.wikimedia.org/P41045 and previous config saved to /var/cache/conftool/dbconfig/20221124-222430-marostegui.json
  • 22:22 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2166 (T321126)', diff saved to https://phabricator.wikimedia.org/P41044 and previous config saved to /var/cache/conftool/dbconfig/20221124-222220-marostegui.json
  • 22:22 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2166.codfw.wmnet with reason: Maintenance
  • 22:22 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2166.codfw.wmnet with reason: Maintenance
  • 22:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T321126)', diff saved to https://phabricator.wikimedia.org/P41043 and previous config saved to /var/cache/conftool/dbconfig/20221124-222158-marostegui.json
  • 22:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P41042 and previous config saved to /var/cache/conftool/dbconfig/20221124-220652-marostegui.json
  • 21:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P41041 and previous config saved to /var/cache/conftool/dbconfig/20221124-215145-marostegui.json
  • 21:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T321126)', diff saved to https://phabricator.wikimedia.org/P41040 and previous config saved to /var/cache/conftool/dbconfig/20221124-213639-marostegui.json
  • 21:34 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2164 (T321126)', diff saved to https://phabricator.wikimedia.org/P41039 and previous config saved to /var/cache/conftool/dbconfig/20221124-213428-marostegui.json
  • 21:34 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2094.codfw.wmnet with reason: Maintenance
  • 21:34 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2094.codfw.wmnet with reason: Maintenance
  • 21:34 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2164.codfw.wmnet with reason: Maintenance
  • 21:33 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2164.codfw.wmnet with reason: Maintenance
  • 21:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T321126)', diff saved to https://phabricator.wikimedia.org/P41038 and previous config saved to /var/cache/conftool/dbconfig/20221124-213351-marostegui.json
  • 21:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P41037 and previous config saved to /var/cache/conftool/dbconfig/20221124-211845-marostegui.json
  • 21:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P41036 and previous config saved to /var/cache/conftool/dbconfig/20221124-210338-marostegui.json
  • 20:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T321126)', diff saved to https://phabricator.wikimedia.org/P41035 and previous config saved to /var/cache/conftool/dbconfig/20221124-204832-marostegui.json
  • 20:46 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2163 (T321126)', diff saved to https://phabricator.wikimedia.org/P41034 and previous config saved to /var/cache/conftool/dbconfig/20221124-204621-marostegui.json
  • 20:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2163.codfw.wmnet with reason: Maintenance
  • 20:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2163.codfw.wmnet with reason: Maintenance
  • 20:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2162 (T321126)', diff saved to https://phabricator.wikimedia.org/P41033 and previous config saved to /var/cache/conftool/dbconfig/20221124-204600-marostegui.json
  • 20:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2162', diff saved to https://phabricator.wikimedia.org/P41032 and previous config saved to /var/cache/conftool/dbconfig/20221124-203053-marostegui.json
  • 20:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2162', diff saved to https://phabricator.wikimedia.org/P41031 and previous config saved to /var/cache/conftool/dbconfig/20221124-201547-marostegui.json
  • 20:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2162 (T321126)', diff saved to https://phabricator.wikimedia.org/P41030 and previous config saved to /var/cache/conftool/dbconfig/20221124-200040-marostegui.json
  • 19:58 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2162 (T321126)', diff saved to https://phabricator.wikimedia.org/P41029 and previous config saved to /var/cache/conftool/dbconfig/20221124-195830-marostegui.json
  • 19:58 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2162.codfw.wmnet with reason: Maintenance
  • 19:58 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2162.codfw.wmnet with reason: Maintenance
  • 19:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2161 (T321126)', diff saved to https://phabricator.wikimedia.org/P41028 and previous config saved to /var/cache/conftool/dbconfig/20221124-195808-marostegui.json
  • 19:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P41027 and previous config saved to /var/cache/conftool/dbconfig/20221124-194302-marostegui.json
  • 19:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P41026 and previous config saved to /var/cache/conftool/dbconfig/20221124-192755-marostegui.json
  • 19:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2161 (T321126)', diff saved to https://phabricator.wikimedia.org/P41025 and previous config saved to /var/cache/conftool/dbconfig/20221124-191249-marostegui.json
  • 19:10 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2161 (T321126)', diff saved to https://phabricator.wikimedia.org/P41024 and previous config saved to /var/cache/conftool/dbconfig/20221124-191038-marostegui.json
  • 19:10 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2161.codfw.wmnet with reason: Maintenance
  • 19:10 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2161.codfw.wmnet with reason: Maintenance
  • 19:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T321126)', diff saved to https://phabricator.wikimedia.org/P41023 and previous config saved to /var/cache/conftool/dbconfig/20221124-191017-marostegui.json
  • 18:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P41022 and previous config saved to /var/cache/conftool/dbconfig/20221124-185510-marostegui.json
  • 18:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P41021 and previous config saved to /var/cache/conftool/dbconfig/20221124-184004-marostegui.json
  • 18:25 mbsantos@deploy1002: helmfile [eqiad] DONE helmfile.d/services/proton: apply
  • 18:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T321126)', diff saved to https://phabricator.wikimedia.org/P41020 and previous config saved to /var/cache/conftool/dbconfig/20221124-182457-marostegui.json
  • 18:23 mbsantos@deploy1002: helmfile [eqiad] START helmfile.d/services/proton: apply
  • 18:22 mbsantos@deploy1002: helmfile [codfw] DONE helmfile.d/services/proton: apply
  • 18:22 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2154 (T321126)', diff saved to https://phabricator.wikimedia.org/P41019 and previous config saved to /var/cache/conftool/dbconfig/20221124-182247-marostegui.json
  • 18:22 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2154.codfw.wmnet with reason: Maintenance
  • 18:22 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2154.codfw.wmnet with reason: Maintenance
  • 18:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2152 (T321126)', diff saved to https://phabricator.wikimedia.org/P41018 and previous config saved to /var/cache/conftool/dbconfig/20221124-182225-marostegui.json
  • 18:21 mbsantos@deploy1002: helmfile [codfw] START helmfile.d/services/proton: apply
  • 18:20 mbsantos@deploy1002: helmfile [staging] DONE helmfile.d/services/proton: apply
  • 18:19 mbsantos@deploy1002: helmfile [staging] START helmfile.d/services/proton: apply
  • 18:15 mbsantos@deploy1002: helmfile [staging] START helmfile.d/services/proton: apply
  • 18:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P41017 and previous config saved to /var/cache/conftool/dbconfig/20221124-180719-marostegui.json
  • 17:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P41016 and previous config saved to /var/cache/conftool/dbconfig/20221124-175212-marostegui.json
  • 17:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2152 (T321126)', diff saved to https://phabricator.wikimedia.org/P41015 and previous config saved to /var/cache/conftool/dbconfig/20221124-173706-marostegui.json
  • 17:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2152 (T321126)', diff saved to https://phabricator.wikimedia.org/P41014 and previous config saved to /var/cache/conftool/dbconfig/20221124-173556-marostegui.json
  • 17:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2152.codfw.wmnet with reason: Maintenance
  • 17:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2152.codfw.wmnet with reason: Maintenance
  • 17:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2100.codfw.wmnet with reason: Maintenance
  • 17:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2100.codfw.wmnet with reason: Maintenance
  • 17:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 17:34 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 17:34 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
  • 17:34 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
  • 17:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1203 (T321126)', diff saved to https://phabricator.wikimedia.org/P41013 and previous config saved to /var/cache/conftool/dbconfig/20221124-173442-marostegui.json
  • 17:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P41012 and previous config saved to /var/cache/conftool/dbconfig/20221124-171936-marostegui.json
  • 17:16 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 17:15 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 17:15 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
  • 17:14 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
  • 17:09 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 17:08 urbanecm@deploy1002: Finished scap: Backport for GrowthExperiments: Remove non-existent variables (duration: 05m 25s)
  • 17:08 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 17:08 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
  • 17:07 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
  • 17:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P41011 and previous config saved to /var/cache/conftool/dbconfig/20221124-170429-marostegui.json
  • 17:03 urbanecm@deploy1002: Started scap: Backport for GrowthExperiments: Remove non-existent variables
  • 17:01 urbanecm@deploy1002: backport aborted: (duration: 00m 01s)
  • 16:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1203 (T321126)', diff saved to https://phabricator.wikimedia.org/P41010 and previous config saved to /var/cache/conftool/dbconfig/20221124-164923-marostegui.json
  • 16:48 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1203 (T321126)', diff saved to https://phabricator.wikimedia.org/P41009 and previous config saved to /var/cache/conftool/dbconfig/20221124-164815-marostegui.json
  • 16:48 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1203.eqiad.wmnet with reason: Maintenance
  • 16:48 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1203.eqiad.wmnet with reason: Maintenance
  • 16:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1193 (T321126)', diff saved to https://phabricator.wikimedia.org/P41008 and previous config saved to /var/cache/conftool/dbconfig/20221124-164754-marostegui.json
  • 16:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P41006 and previous config saved to /var/cache/conftool/dbconfig/20221124-163247-marostegui.json
  • 16:22 SandraEbele: successfully restarted webrequest-druid-daily-coord as part of weekly deployment train.
  • 16:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P41004 and previous config saved to /var/cache/conftool/dbconfig/20221124-161741-marostegui.json
  • 16:15 SandraEbele: killed webrequest-druid-daily-coord for restart as part of weekly deployment train.
  • 16:13 SandraEbele: successfully restarted webrequest-druid-hourly-coord as part of weekly deployment train.
  • 16:11 SandraEbele: killed webrequest-druid-hourly-coord for restart as part of weekly deployment train.
  • 16:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1193 (T321126)', diff saved to https://phabricator.wikimedia.org/P41003 and previous config saved to /var/cache/conftool/dbconfig/20221124-160234-marostegui.json
  • 16:00 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1193 (T321126)', diff saved to https://phabricator.wikimedia.org/P41002 and previous config saved to /var/cache/conftool/dbconfig/20221124-160026-marostegui.json
  • 16:00 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1193.eqiad.wmnet with reason: Maintenance
  • 16:00 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1193.eqiad.wmnet with reason: Maintenance
  • 16:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1192 (T321126)', diff saved to https://phabricator.wikimedia.org/P41001 and previous config saved to /var/cache/conftool/dbconfig/20221124-160005-marostegui.json
  • 15:45 ebysans@deploy1002: Finished deploy [analytics/refinery@1bfb89f] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@1bfb89f] (duration: 02m 00s)
  • 15:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P41000 and previous config saved to /var/cache/conftool/dbconfig/20221124-154458-marostegui.json
  • 15:43 ebysans@deploy1002: Started deploy [analytics/refinery@1bfb89f] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@1bfb89f]
  • 15:42 ebysans@deploy1002: Finished deploy [analytics/refinery@1bfb89f] (thin): Regular analytics weekly train THIN [analytics/refinery@1bfb89f] (duration: 00m 07s)
  • 15:42 ebysans@deploy1002: Started deploy [analytics/refinery@1bfb89f] (thin): Regular analytics weekly train THIN [analytics/refinery@1bfb89f]
  • 15:41 ebysans@deploy1002: Finished deploy [analytics/refinery@1bfb89f]: Regular analytics weekly train [analytics/refinery@1bfb89f] (duration: 09m 06s)
  • 15:32 ebysans@deploy1002: Started deploy [analytics/refinery@1bfb89f]: Regular analytics weekly train [analytics/refinery@1bfb89f]
  • 15:30 SandraEbele: Started deployment of refinery as part of weekly deployment train
  • 15:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P40999 and previous config saved to /var/cache/conftool/dbconfig/20221124-152952-marostegui.json
  • 15:25 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply
  • 15:25 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply
  • 15:24 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply
  • 15:20 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 15:19 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 15:19 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
  • 15:19 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply
  • 15:19 Lucas_WMDE: UTC afternoon backport+config window done
  • 15:19 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
  • 15:17 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ printf 'https://en.wikipedia.org/static/images/mobile/copyright/wikipedia-%s.svg\n' {tagline-zh{,-hans},wordmark-zh-hans} | mwscript purgeList.php # T320859
  • 15:16 lucaswerkmeister-wmde@deploy1002: Synchronized static/images/: Config: zhwiki: Revert 20 years logos (T320859) (3/3) (duration: 04m 43s)
  • 15:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1192 (T321126)', diff saved to https://phabricator.wikimedia.org/P40998 and previous config saved to /var/cache/conftool/dbconfig/20221124-151445-marostegui.json
  • 15:13 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 15:13 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1192 (T321126)', diff saved to https://phabricator.wikimedia.org/P40997 and previous config saved to /var/cache/conftool/dbconfig/20221124-151338-marostegui.json
  • 15:13 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1192.eqiad.wmnet with reason: Maintenance
  • 15:13 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1192.eqiad.wmnet with reason: Maintenance
  • 15:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T321126)', diff saved to https://phabricator.wikimedia.org/P40996 and previous config saved to /var/cache/conftool/dbconfig/20221124-151316-marostegui.json
  • 15:13 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 15:13 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
  • 15:12 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
  • 15:11 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/logos.php: Config: zhwiki: Revert 20 years logos (T320859) (2/3) (duration: 04m 34s)
  • 15:07 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 15:07 lucaswerkmeister-wmde@deploy1002: Synchronized logos/config.yaml: Config: zhwiki: Revert 20 years logos (T320859) (1/3) (duration: 04m 41s)
  • 15:06 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 15:06 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
  • 15:06 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
  • 15:04 oblivian@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply
  • 15:04 oblivian@deploy1002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
  • 15:03 isaranto@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
  • 15:03 isaranto@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
  • 15:01 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mathoid: apply
  • 15:01 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/mathoid: apply
  • 14:59 oblivian@deploy1002: helmfile [staging] DONE helmfile.d/services/mathoid: apply
  • 14:59 oblivian@deploy1002: helmfile [staging] START helmfile.d/services/mathoid: apply
  • 14:59 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/mathoid: apply
  • 14:58 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/mathoid: apply
  • 14:58 moritzm: rebalance Ganeti group C/eqiad T311687
  • 14:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P40995 and previous config saved to /var/cache/conftool/dbconfig/20221124-145810-marostegui.json
  • 14:56 isaranto@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
  • 14:56 isaranto@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
  • 14:53 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/mathoid: apply
  • 14:53 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/mathoid: apply
  • 14:52 jbond@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2050.codfw.wmnet with OS bullseye
  • 14:52 isaranto@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
  • 14:51 isaranto@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
  • 14:50 claime: updating package otelcol-contrib to 0.66.0 in component thirdparty/otelcol-contrib
  • 14:48 isaranto@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
  • 14:46 isaranto@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
  • 14:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P40994 and previous config saved to /var/cache/conftool/dbconfig/20221124-144303-marostegui.json
  • 14:37 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ printf 'https://en.wikipedia.org/static/images/project-logos/wikidatawiki%s.png\n' '-1.5x' '-2x' | mwscript purgeList.php # T323734
  • 14:36 lucaswerkmeister-wmde@deploy1002: Finished scap: Backport for wikidatawiki: Add language-specific logos (T323734) (duration: 17m 24s)
  • 14:35 jbond@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2050.codfw.wmnet with reason: host reimage
  • 14:31 jbond@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2050.codfw.wmnet with reason: host reimage
  • 14:29 isaranto@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
  • 14:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T321126)', diff saved to https://phabricator.wikimedia.org/P40993 and previous config saved to /var/cache/conftool/dbconfig/20221124-142756-marostegui.json
  • 14:27 isaranto@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
  • 14:25 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 14:24 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1178 (T321126)', diff saved to https://phabricator.wikimedia.org/P40992 and previous config saved to /var/cache/conftool/dbconfig/20221124-142447-marostegui.json
  • 14:24 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1178.eqiad.wmnet with reason: Maintenance
  • 14:24 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 14:24 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
  • 14:24 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1178.eqiad.wmnet with reason: Maintenance
  • 14:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T321126)', diff saved to https://phabricator.wikimedia.org/P40991 and previous config saved to /var/cache/conftool/dbconfig/20221124-142426-marostegui.json
  • 14:23 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
  • 14:20 lucaswerkmeister-wmde@deploy1002: lucaswerkmeister-wmde and stang: Backport for wikidatawiki: Add language-specific logos (T323734) synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet
  • 14:19 lucaswerkmeister-wmde@deploy1002: Started scap: Backport for wikidatawiki: Add language-specific logos (T323734)
  • 14:18 isaranto@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
  • 14:18 isaranto@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
  • 14:13 jbond@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2050.codfw.wmnet with OS bullseye
  • 14:11 jbond@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be2050.codfw.wmnet with OS bullseye
  • 14:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P40990 and previous config saved to /var/cache/conftool/dbconfig/20221124-140920-marostegui.json
  • 13:59 oblivian@deploy1002: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
  • 13:59 oblivian@deploy1002: helmfile [staging] START helmfile.d/services/developer-portal: apply
  • 13:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P40989 and previous config saved to /var/cache/conftool/dbconfig/20221124-135413-marostegui.json
  • 13:53 btullis: Removed unused and expiring kafka_jumbo certificates. T323697
  • 13:43 jbond@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2050.codfw.wmnet with OS bullseye
  • 13:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T321126)', diff saved to https://phabricator.wikimedia.org/P40988 and previous config saved to /var/cache/conftool/dbconfig/20221124-133907-marostegui.json
  • 13:38 btullis@cumin1001: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0)
  • 13:38 btullis@cumin1001: Added views for new wiki: igwiktionary T314645
  • 13:38 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1177 (T321126)', diff saved to https://phabricator.wikimedia.org/P40987 and previous config saved to /var/cache/conftool/dbconfig/20221124-133759-marostegui.json
  • 13:37 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1177.eqiad.wmnet with reason: Maintenance
  • 13:37 jbond@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be2050.codfw.wmnet with OS bullseye
  • 13:37 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1177.eqiad.wmnet with reason: Maintenance
  • 13:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T321126)', diff saved to https://phabricator.wikimedia.org/P40986 and previous config saved to /var/cache/conftool/dbconfig/20221124-133738-marostegui.json
  • 13:30 jbond@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2050.codfw.wmnet with OS bullseye
  • 13:30 moritzm: restarting slapd on serpens/seaborgium
  • 13:22 jbond@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2050.codfw.wmnet with OS bullseye
  • 13:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P40985 and previous config saved to /var/cache/conftool/dbconfig/20221124-132231-marostegui.json
  • 13:13 btullis@cumin1001: START - Cookbook sre.wikireplicas.add-wiki
  • 13:12 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas (exit_code=0) rolling restart_daemons on A:schema-eqiad
  • 13:11 jmm@cumin2002: START - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas rolling restart_daemons on A:schema-eqiad
  • 13:10 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas (exit_code=0) rolling restart_daemons on A:schema-codfw
  • 13:09 jmm@cumin2002: START - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas rolling restart_daemons on A:schema-codfw
  • 13:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P40984 and previous config saved to /var/cache/conftool/dbconfig/20221124-130725-marostegui.json
  • 13:04 jbond@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2050.codfw.wmnet with reason: host reimage
  • 13:02 moritzm: installing glibc security updates on buster
  • 13:01 jbond@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2050.codfw.wmnet with reason: host reimage
  • 12:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T321126)', diff saved to https://phabricator.wikimedia.org/P40983 and previous config saved to /var/cache/conftool/dbconfig/20221124-125218-marostegui.json
  • 12:51 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1172 (T321126)', diff saved to https://phabricator.wikimedia.org/P40982 and previous config saved to /var/cache/conftool/dbconfig/20221124-125111-marostegui.json
  • 12:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1172.eqiad.wmnet with reason: Maintenance
  • 12:50 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1172.eqiad.wmnet with reason: Maintenance
  • 12:50 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 12:50 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 12:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T321126)', diff saved to https://phabricator.wikimedia.org/P40981 and previous config saved to /var/cache/conftool/dbconfig/20221124-125033-marostegui.json
  • 12:42 jbond@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2050.codfw.wmnet with OS bullseye
  • 12:42 jbond@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be2050.codfw.wmnet with OS bullseye
  • 12:38 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1044.eqiad.wmnet with OS bullseye
  • 12:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P40980 and previous config saved to /var/cache/conftool/dbconfig/20221124-123527-marostegui.json
  • 12:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on idp-test1002.wikimedia.org with reason: Testing some changes, service will be down from time to time
  • 12:22 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on idp-test1002.wikimedia.org with reason: Testing some changes, service will be down from time to time
  • 12:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P40979 and previous config saved to /var/cache/conftool/dbconfig/20221124-122020-marostegui.json
  • 12:18 jbond@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2050.codfw.wmnet with OS bullseye
  • 12:17 jbond@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be2050.codfw.wmnet with OS bullseye
  • 12:15 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1044.eqiad.wmnet with reason: host reimage
  • 12:12 aborrero@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1044.eqiad.wmnet with reason: host reimage
  • 12:07 jbond@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2050.codfw.wmnet with OS bullseye
  • 12:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T321126)', diff saved to https://phabricator.wikimedia.org/P40978 and previous config saved to /var/cache/conftool/dbconfig/20221124-120514-marostegui.json
  • 11:59 aborrero@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1044.eqiad.wmnet with OS bullseye
  • 11:52 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/datahub: sync on main
  • 11:51 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/datahub: apply on main
  • 11:50 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1167 (T321126)', diff saved to https://phabricator.wikimedia.org/P40977 and previous config saved to /var/cache/conftool/dbconfig/20221124-115004-marostegui.json
  • 11:49 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 11:49 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 11:49 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1167.eqiad.wmnet with reason: Maintenance
  • 11:49 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1167.eqiad.wmnet with reason: Maintenance
  • 11:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126 (T321126)', diff saved to https://phabricator.wikimedia.org/P40976 and previous config saved to /var/cache/conftool/dbconfig/20221124-114925-marostegui.json
  • 11:48 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/datahub: sync on main
  • 11:46 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/datahub: apply on main
  • 11:45 oblivian@deploy1002: helmfile [staging] DONE helmfile.d/services/datahub: sync on main
  • 11:44 oblivian@deploy1002: helmfile [staging] START helmfile.d/services/datahub: apply on main
  • 11:43 jbond@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be2050.codfw.wmnet with OS bullseye
  • 11:40 oblivian@deploy1002: helmfile [staging] START helmfile.d/services/datahub: apply on main
  • 11:39 oblivian@deploy1002: helmfile [staging] START helmfile.d/services/datahub: apply on main
  • 11:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126', diff saved to https://phabricator.wikimedia.org/P40974 and previous config saved to /var/cache/conftool/dbconfig/20221124-113418-marostegui.json
  • 11:31 jbond@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2050.codfw.wmnet with OS bullseye
  • 11:31 jbond@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be2050.codfw.wmnet with OS bullseye
  • 11:28 oblivian@deploy1002: helmfile [staging] START helmfile.d/services/datahub: apply on main
  • 11:25 isaranto@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
  • 11:22 isaranto@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
  • 11:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126', diff saved to https://phabricator.wikimedia.org/P40973 and previous config saved to /var/cache/conftool/dbconfig/20221124-111912-marostegui.json
  • 11:18 jbond@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2050.codfw.wmnet with OS bullseye
  • 11:16 oblivian@deploy1002: helmfile [staging] START helmfile.d/services/datahub: apply on main
  • 11:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126 (T321126)', diff saved to https://phabricator.wikimedia.org/P40972 and previous config saved to /var/cache/conftool/dbconfig/20221124-110405-marostegui.json
  • 11:02 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1126 (T321126)', diff saved to https://phabricator.wikimedia.org/P40971 and previous config saved to /var/cache/conftool/dbconfig/20221124-110258-marostegui.json
  • 11:02 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1126.eqiad.wmnet with reason: Maintenance
  • 11:02 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1126.eqiad.wmnet with reason: Maintenance
  • 11:02 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1116.eqiad.wmnet with reason: Maintenance
  • 11:02 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1116.eqiad.wmnet with reason: Maintenance
  • 11:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114 (T321126)', diff saved to https://phabricator.wikimedia.org/P40970 and previous config saved to /var/cache/conftool/dbconfig/20221124-110220-marostegui.json
  • 10:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114', diff saved to https://phabricator.wikimedia.org/P40969 and previous config saved to /var/cache/conftool/dbconfig/20221124-104714-marostegui.json
  • 10:41 akosiaris: reboot rdb1010, rdb1012, rdb2008, rdb2010 for kernel upgrades. All are redis replicas, there should be no impact.
  • 10:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114', diff saved to https://phabricator.wikimedia.org/P40968 and previous config saved to /var/cache/conftool/dbconfig/20221124-103207-marostegui.json
  • 10:25 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:23 cmooney@cumin1001: START - Cookbook sre.dns.netbox
  • 10:23 cmooney@cumin1001: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 10:20 dcaro@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:20 dcaro@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Removed AAAA entry for all clouddbs - dcaro@cumin1001"
  • 10:19 dcaro@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Removed AAAA entry for all clouddbs - dcaro@cumin1001"
  • 10:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114 (T321126)', diff saved to https://phabricator.wikimedia.org/P40967 and previous config saved to /var/cache/conftool/dbconfig/20221124-101701-marostegui.json
  • 10:16 dcaro@cumin1001: START - Cookbook sre.dns.netbox
  • 10:14 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1114 (T321126)', diff saved to https://phabricator.wikimedia.org/P40966 and previous config saved to /var/cache/conftool/dbconfig/20221124-101452-marostegui.json
  • 10:14 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1114.eqiad.wmnet with reason: Maintenance
  • 10:14 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1114.eqiad.wmnet with reason: Maintenance
  • 10:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111 (T321126)', diff saved to https://phabricator.wikimedia.org/P40965 and previous config saved to /var/cache/conftool/dbconfig/20221124-101431-marostegui.json
  • 09:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111', diff saved to https://phabricator.wikimedia.org/P40964 and previous config saved to /var/cache/conftool/dbconfig/20221124-095925-marostegui.json
  • 09:59 dcaro@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:59 dcaro@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Removed AAAA entry for clouddb1013 - dcaro@cumin1001"
  • 09:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 09:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 09:57 dcaro@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Removed AAAA entry for clouddb1013 - dcaro@cumin1001"
  • 09:54 dcaro@cumin1001: START - Cookbook sre.dns.netbox
  • 09:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111', diff saved to https://phabricator.wikimedia.org/P40963 and previous config saved to /var/cache/conftool/dbconfig/20221124-094418-marostegui.json
  • 09:42 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts graphite2003.codfw.wmnet
  • 09:41 filippo@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:41 filippo@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: graphite2003.codfw.wmnet decommissioned, removing all IPs except the asset tag one - filippo@cumin1001"
  • 09:40 filippo@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: graphite2003.codfw.wmnet decommissioned, removing all IPs except the asset tag one - filippo@cumin1001"
  • 09:38 filippo@cumin1001: START - Cookbook sre.dns.netbox
  • 09:33 filippo@cumin1001: START - Cookbook sre.hosts.decommission for hosts graphite2003.codfw.wmnet
  • 09:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111 (T321126)', diff saved to https://phabricator.wikimedia.org/P40962 and previous config saved to /var/cache/conftool/dbconfig/20221124-092912-marostegui.json
  • 09:28 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1111 (T321126)', diff saved to https://phabricator.wikimedia.org/P40961 and previous config saved to /var/cache/conftool/dbconfig/20221124-092804-marostegui.json
  • 09:27 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1111.eqiad.wmnet with reason: Maintenance
  • 09:27 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1111.eqiad.wmnet with reason: Maintenance
  • 09:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1104 (T321126)', diff saved to https://phabricator.wikimedia.org/P40960 and previous config saved to /var/cache/conftool/dbconfig/20221124-092742-marostegui.json
  • 09:26 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
  • 09:26 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
  • 09:24 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
  • 09:23 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/cxserver: apply
  • 09:22 oblivian@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
  • 09:20 oblivian@deploy1002: helmfile [staging] START helmfile.d/services/cxserver: apply
  • 09:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1104', diff saved to https://phabricator.wikimedia.org/P40959 and previous config saved to /var/cache/conftool/dbconfig/20221124-091236-marostegui.json
  • 09:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
  • 09:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
  • 09:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1201 (T323214)', diff saved to https://phabricator.wikimedia.org/P40958 and previous config saved to /var/cache/conftool/dbconfig/20221124-091017-ladsgroup.json
  • 08:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1104', diff saved to https://phabricator.wikimedia.org/P40957 and previous config saved to /var/cache/conftool/dbconfig/20221124-085729-marostegui.json
  • 08:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1201', diff saved to https://phabricator.wikimedia.org/P40956 and previous config saved to /var/cache/conftool/dbconfig/20221124-085511-ladsgroup.json
  • 08:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1104 (T321126)', diff saved to https://phabricator.wikimedia.org/P40955 and previous config saved to /var/cache/conftool/dbconfig/20221124-084223-marostegui.json
  • 08:40 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1104 (T321126)', diff saved to https://phabricator.wikimedia.org/P40954 and previous config saved to /var/cache/conftool/dbconfig/20221124-084015-marostegui.json
  • 08:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1104.eqiad.wmnet with reason: Maintenance
  • 08:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1201', diff saved to https://phabricator.wikimedia.org/P40953 and previous config saved to /var/cache/conftool/dbconfig/20221124-084004-ladsgroup.json
  • 08:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1104.eqiad.wmnet with reason: Maintenance
  • 08:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318 (T321126)', diff saved to https://phabricator.wikimedia.org/P40952 and previous config saved to /var/cache/conftool/dbconfig/20221124-083954-marostegui.json
  • 08:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1201 (T323214)', diff saved to https://phabricator.wikimedia.org/P40951 and previous config saved to /var/cache/conftool/dbconfig/20221124-082458-ladsgroup.json
  • 08:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318', diff saved to https://phabricator.wikimedia.org/P40950 and previous config saved to /var/cache/conftool/dbconfig/20221124-082447-marostegui.json
  • 08:13 moritzm: installing tomcat9 security updates
  • 08:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318', diff saved to https://phabricator.wikimedia.org/P40949 and previous config saved to /var/cache/conftool/dbconfig/20221124-080941-marostegui.json
  • 08:04 moritzm: rebalance Ganeti group A/codfw following reboots
  • 07:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318 (T321126)', diff saved to https://phabricator.wikimedia.org/P40948 and previous config saved to /var/cache/conftool/dbconfig/20221124-075434-marostegui.json
  • 07:52 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3318 (T321126)', diff saved to https://phabricator.wikimedia.org/P40947 and previous config saved to /var/cache/conftool/dbconfig/20221124-075226-marostegui.json
  • 07:52 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1101.eqiad.wmnet with reason: Maintenance
  • 07:52 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1101.eqiad.wmnet with reason: Maintenance
  • 07:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318 (T321126)', diff saved to https://phabricator.wikimedia.org/P40946 and previous config saved to /var/cache/conftool/dbconfig/20221124-075205-marostegui.json
  • 07:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T323214)', diff saved to https://phabricator.wikimedia.org/P40945 and previous config saved to /var/cache/conftool/dbconfig/20221124-074517-ladsgroup.json
  • 07:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318', diff saved to https://phabricator.wikimedia.org/P40944 and previous config saved to /var/cache/conftool/dbconfig/20221124-073658-marostegui.json
  • 07:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1201 (T323214)', diff saved to https://phabricator.wikimedia.org/P40943 and previous config saved to /var/cache/conftool/dbconfig/20221124-073637-ladsgroup.json
  • 07:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1201.eqiad.wmnet with reason: Maintenance
  • 07:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1201.eqiad.wmnet with reason: Maintenance
  • 07:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1187 (T323214)', diff saved to https://phabricator.wikimedia.org/P40942 and previous config saved to /var/cache/conftool/dbconfig/20221124-073616-ladsgroup.json
  • 07:30 phedenskog@deploy1002: Finished deploy [performance/navtiming@e421904]: (no justification provided) (duration: 00m 08s)
  • 07:30 phedenskog@deploy1002: Started deploy [performance/navtiming@e421904]: (no justification provided)
  • 07:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P40941 and previous config saved to /var/cache/conftool/dbconfig/20221124-073011-ladsgroup.json
  • 07:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318', diff saved to https://phabricator.wikimedia.org/P40940 and previous config saved to /var/cache/conftool/dbconfig/20221124-072152-marostegui.json
  • 07:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1187', diff saved to https://phabricator.wikimedia.org/P40939 and previous config saved to /var/cache/conftool/dbconfig/20221124-072110-ladsgroup.json
  • 07:15 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 07:15 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 07:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P40938 and previous config saved to /var/cache/conftool/dbconfig/20221124-071504-ladsgroup.json
  • 07:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1182.eqiad.wmnet with reason: Maintenance
  • 07:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1182.eqiad.wmnet with reason: Maintenance
  • 07:09 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/citoid: apply
  • 07:09 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/citoid: apply
  • 07:08 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
  • 07:07 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/citoid: apply
  • 07:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 07:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 07:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318 (T321126)', diff saved to https://phabricator.wikimedia.org/P40936 and previous config saved to /var/cache/conftool/dbconfig/20221124-070645-marostegui.json
  • 07:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1187', diff saved to https://phabricator.wikimedia.org/P40935 and previous config saved to /var/cache/conftool/dbconfig/20221124-070603-ladsgroup.json
  • 07:05 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/citoid: apply
  • 07:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depool db1181 T323117', diff saved to https://phabricator.wikimedia.org/P40934 and previous config saved to /var/cache/conftool/dbconfig/20221124-070546-ladsgroup.json
  • 07:05 oblivian@deploy1002: helmfile [staging] DONE helmfile.d/services/citoid: apply
  • 07:05 oblivian@deploy1002: helmfile [staging] START helmfile.d/services/citoid: apply
  • 07:04 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3318 (T321126)', diff saved to https://phabricator.wikimedia.org/P40933 and previous config saved to /var/cache/conftool/dbconfig/20221124-070437-marostegui.json
  • 07:04 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1099.eqiad.wmnet with reason: Maintenance
  • 07:04 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1099.eqiad.wmnet with reason: Maintenance
  • 07:03 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1136.eqiad.wmnet with reason: Maintenance
  • 07:03 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1136.eqiad.wmnet with reason: Maintenance
  • 07:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Promote db1136 to s7 primary and set section read-write T323117', diff saved to https://phabricator.wikimedia.org/P40932 and previous config saved to /var/cache/conftool/dbconfig/20221124-070250-ladsgroup.json
  • 07:02 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 07:02 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 07:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Set s7 eqiad as read-only for maintenance - T323117', diff saved to https://phabricator.wikimedia.org/P40931 and previous config saved to /var/cache/conftool/dbconfig/20221124-070215-ladsgroup.json
  • 07:02 Amir1: Starting s7 eqiad failover from db1181 to db1136 - T323117
  • 07:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T323214)', diff saved to https://phabricator.wikimedia.org/P40930 and previous config saved to /var/cache/conftool/dbconfig/20221124-065956-ladsgroup.json
  • 06:56 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2118.codfw.wmnet with reason: Maintenance
  • 06:56 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2118.codfw.wmnet with reason: Maintenance
  • 06:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1187 (T323214)', diff saved to https://phabricator.wikimedia.org/P40929 and previous config saved to /var/cache/conftool/dbconfig/20221124-065057-ladsgroup.json
  • 06:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Set db1136 with weight 0 T323117', diff saved to https://phabricator.wikimedia.org/P40928 and previous config saved to /var/cache/conftool/dbconfig/20221124-060742-ladsgroup.json
  • 06:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 30 hosts with reason: Primary switchover s7 T323117
  • 06:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 30 hosts with reason: Primary switchover s7 T323117
  • 06:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1187 (T323214)', diff saved to https://phabricator.wikimedia.org/P40927 and previous config saved to /var/cache/conftool/dbconfig/20221124-060330-ladsgroup.json
  • 06:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1187.eqiad.wmnet with reason: Maintenance
  • 06:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1187.eqiad.wmnet with reason: Maintenance
  • 06:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T323214)', diff saved to https://phabricator.wikimedia.org/P40926 and previous config saved to /var/cache/conftool/dbconfig/20221124-060309-ladsgroup.json
  • 05:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P40925 and previous config saved to /var/cache/conftool/dbconfig/20221124-054802-ladsgroup.json
  • 05:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P40924 and previous config saved to /var/cache/conftool/dbconfig/20221124-053256-ladsgroup.json
  • 05:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2180 (T323214)', diff saved to https://phabricator.wikimedia.org/P40923 and previous config saved to /var/cache/conftool/dbconfig/20221124-052830-ladsgroup.json
  • 05:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2180.codfw.wmnet with reason: Maintenance
  • 05:28 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2180.codfw.wmnet with reason: Maintenance
  • 05:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316 (T323214)', diff saved to https://phabricator.wikimedia.org/P40922 and previous config saved to /var/cache/conftool/dbconfig/20221124-052808-ladsgroup.json
  • 05:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T323214)', diff saved to https://phabricator.wikimedia.org/P40921 and previous config saved to /var/cache/conftool/dbconfig/20221124-051749-ladsgroup.json
  • 05:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316', diff saved to https://phabricator.wikimedia.org/P40920 and previous config saved to /var/cache/conftool/dbconfig/20221124-051301-ladsgroup.json
  • 04:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316', diff saved to https://phabricator.wikimedia.org/P40919 and previous config saved to /var/cache/conftool/dbconfig/20221124-045755-ladsgroup.json
  • 04:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316 (T323214)', diff saved to https://phabricator.wikimedia.org/P40918 and previous config saved to /var/cache/conftool/dbconfig/20221124-044249-ladsgroup.json
  • 04:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1180 (T323214)', diff saved to https://phabricator.wikimedia.org/P40917 and previous config saved to /var/cache/conftool/dbconfig/20221124-042757-ladsgroup.json
  • 04:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
  • 04:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
  • 04:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T323214)', diff saved to https://phabricator.wikimedia.org/P40916 and previous config saved to /var/cache/conftool/dbconfig/20221124-042736-ladsgroup.json
  • 04:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P40915 and previous config saved to /var/cache/conftool/dbconfig/20221124-041230-ladsgroup.json
  • 03:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P40914 and previous config saved to /var/cache/conftool/dbconfig/20221124-035723-ladsgroup.json
  • 03:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T323214)', diff saved to https://phabricator.wikimedia.org/P40913 and previous config saved to /var/cache/conftool/dbconfig/20221124-034217-ladsgroup.json
  • 03:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2171:3316 (T323214)', diff saved to https://phabricator.wikimedia.org/P40912 and previous config saved to /var/cache/conftool/dbconfig/20221124-030901-ladsgroup.json
  • 03:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2171.codfw.wmnet with reason: Maintenance
  • 03:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2171.codfw.wmnet with reason: Maintenance
  • 03:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3316 (T323214)', diff saved to https://phabricator.wikimedia.org/P40911 and previous config saved to /var/cache/conftool/dbconfig/20221124-030829-ladsgroup.json
  • 03:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T321126)', diff saved to https://phabricator.wikimedia.org/P40910 and previous config saved to /var/cache/conftool/dbconfig/20221124-030025-marostegui.json
  • 02:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3316', diff saved to https://phabricator.wikimedia.org/P40909 and previous config saved to /var/cache/conftool/dbconfig/20221124-025322-ladsgroup.json
  • 02:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P40908 and previous config saved to /var/cache/conftool/dbconfig/20221124-024518-marostegui.json
  • 02:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3316', diff saved to https://phabricator.wikimedia.org/P40907 and previous config saved to /var/cache/conftool/dbconfig/20221124-023816-ladsgroup.json
  • 02:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1168 (T323214)', diff saved to https://phabricator.wikimedia.org/P40906 and previous config saved to /var/cache/conftool/dbconfig/20221124-023500-ladsgroup.json
  • 02:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1168.eqiad.wmnet with reason: Maintenance
  • 02:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1168.eqiad.wmnet with reason: Maintenance
  • 02:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T323214)', diff saved to https://phabricator.wikimedia.org/P40905 and previous config saved to /var/cache/conftool/dbconfig/20221124-023428-ladsgroup.json
  • 02:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P40904 and previous config saved to /var/cache/conftool/dbconfig/20221124-023011-marostegui.json
  • 02:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3316 (T323214)', diff saved to https://phabricator.wikimedia.org/P40903 and previous config saved to /var/cache/conftool/dbconfig/20221124-022309-ladsgroup.json
  • 02:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P40902 and previous config saved to /var/cache/conftool/dbconfig/20221124-021921-ladsgroup.json
  • 02:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T321126)', diff saved to https://phabricator.wikimedia.org/P40901 and previous config saved to /var/cache/conftool/dbconfig/20221124-021505-marostegui.json
  • 02:12 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2182 (T321126)', diff saved to https://phabricator.wikimedia.org/P40900 and previous config saved to /var/cache/conftool/dbconfig/20221124-021233-marostegui.json
  • 02:12 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2182.codfw.wmnet with reason: Maintenance
  • 02:12 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2182.codfw.wmnet with reason: Maintenance
  • 02:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317 (T321126)', diff saved to https://phabricator.wikimedia.org/P40899 and previous config saved to /var/cache/conftool/dbconfig/20221124-021211-marostegui.json
  • 02:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P40898 and previous config saved to /var/cache/conftool/dbconfig/20221124-020415-ladsgroup.json
  • 01:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317', diff saved to https://phabricator.wikimedia.org/P40897 and previous config saved to /var/cache/conftool/dbconfig/20221124-015705-marostegui.json
  • 01:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T323214)', diff saved to https://phabricator.wikimedia.org/P40896 and previous config saved to /var/cache/conftool/dbconfig/20221124-014908-ladsgroup.json
  • 01:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317', diff saved to https://phabricator.wikimedia.org/P40895 and previous config saved to /var/cache/conftool/dbconfig/20221124-014158-marostegui.json
  • 01:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317 (T321126)', diff saved to https://phabricator.wikimedia.org/P40894 and previous config saved to /var/cache/conftool/dbconfig/20221124-012652-marostegui.json
  • 01:24 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2169:3317 (T321126)', diff saved to https://phabricator.wikimedia.org/P40893 and previous config saved to /var/cache/conftool/dbconfig/20221124-012420-marostegui.json
  • 01:24 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2169.codfw.wmnet with reason: Maintenance
  • 01:24 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2169.codfw.wmnet with reason: Maintenance
  • 01:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317 (T321126)', diff saved to https://phabricator.wikimedia.org/P40892 and previous config saved to /var/cache/conftool/dbconfig/20221124-012409-marostegui.json
  • 01:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317', diff saved to https://phabricator.wikimedia.org/P40891 and previous config saved to /var/cache/conftool/dbconfig/20221124-010903-marostegui.json
  • 00:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317', diff saved to https://phabricator.wikimedia.org/P40890 and previous config saved to /var/cache/conftool/dbconfig/20221124-005357-marostegui.json
  • 00:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2169:3316 (T323214)', diff saved to https://phabricator.wikimedia.org/P40889 and previous config saved to /var/cache/conftool/dbconfig/20221124-004510-ladsgroup.json
  • 00:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
  • 00:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
  • 00:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T323214)', diff saved to https://phabricator.wikimedia.org/P40888 and previous config saved to /var/cache/conftool/dbconfig/20221124-004448-ladsgroup.json
  • 00:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1165 (T323214)', diff saved to https://phabricator.wikimedia.org/P40887 and previous config saved to /var/cache/conftool/dbconfig/20221124-004006-ladsgroup.json
  • 00:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 00:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 00:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance
  • 00:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance
  • 00:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317 (T321126)', diff saved to https://phabricator.wikimedia.org/P40886 and previous config saved to /var/cache/conftool/dbconfig/20221124-003850-marostegui.json
  • 00:36 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2168:3317 (T321126)', diff saved to https://phabricator.wikimedia.org/P40885 and previous config saved to /var/cache/conftool/dbconfig/20221124-003618-marostegui.json
  • 00:36 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2168.codfw.wmnet with reason: Maintenance
  • 00:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2168.codfw.wmnet with reason: Maintenance
  • 00:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T321126)', diff saved to https://phabricator.wikimedia.org/P40884 and previous config saved to /var/cache/conftool/dbconfig/20221124-003556-marostegui.json
  • 00:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P40883 and previous config saved to /var/cache/conftool/dbconfig/20221124-002941-ladsgroup.json
  • 00:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P40882 and previous config saved to /var/cache/conftool/dbconfig/20221124-002050-marostegui.json
  • 00:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P40881 and previous config saved to /var/cache/conftool/dbconfig/20221124-001435-ladsgroup.json
  • 00:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P40880 and previous config saved to /var/cache/conftool/dbconfig/20221124-000543-marostegui.json

2022-11-23

  • 23:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T323214)', diff saved to https://phabricator.wikimedia.org/P40879 and previous config saved to /var/cache/conftool/dbconfig/20221123-235928-ladsgroup.json
  • 23:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T321126)', diff saved to https://phabricator.wikimedia.org/P40878 and previous config saved to /var/cache/conftool/dbconfig/20221123-235037-marostegui.json
  • 23:48 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2159 (T321126)', diff saved to https://phabricator.wikimedia.org/P40877 and previous config saved to /var/cache/conftool/dbconfig/20221123-234806-marostegui.json
  • 23:48 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2095.codfw.wmnet with reason: Maintenance
  • 23:48 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2095.codfw.wmnet with reason: Maintenance
  • 23:47 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2159.codfw.wmnet with reason: Maintenance
  • 23:47 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2159.codfw.wmnet with reason: Maintenance
  • 23:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T321126)', diff saved to https://phabricator.wikimedia.org/P40876 and previous config saved to /var/cache/conftool/dbconfig/20221123-234729-marostegui.json
  • 23:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P40875 and previous config saved to /var/cache/conftool/dbconfig/20221123-233222-marostegui.json
  • 23:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P40874 and previous config saved to /var/cache/conftool/dbconfig/20221123-231716-marostegui.json
  • 23:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1140.eqiad.wmnet with reason: Maintenance
  • 23:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1140.eqiad.wmnet with reason: Maintenance
  • 23:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 (T323214)', diff saved to https://phabricator.wikimedia.org/P40872 and previous config saved to /var/cache/conftool/dbconfig/20221123-230624-ladsgroup.json
  • 23:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T321126)', diff saved to https://phabricator.wikimedia.org/P40871 and previous config saved to /var/cache/conftool/dbconfig/20221123-230209-marostegui.json
  • 22:59 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2150 (T321126)', diff saved to https://phabricator.wikimedia.org/P40870 and previous config saved to /var/cache/conftool/dbconfig/20221123-225937-marostegui.json
  • 22:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2150.codfw.wmnet with reason: Maintenance
  • 22:59 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2150.codfw.wmnet with reason: Maintenance
  • 22:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T321126)', diff saved to https://phabricator.wikimedia.org/P40869 and previous config saved to /var/cache/conftool/dbconfig/20221123-225916-marostegui.json
  • 22:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P40868 and previous config saved to /var/cache/conftool/dbconfig/20221123-225118-ladsgroup.json
  • 22:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P40866 and previous config saved to /var/cache/conftool/dbconfig/20221123-224409-marostegui.json
  • 22:40 jbond@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2050.codfw.wmnet with OS bullseye
  • 22:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P40865 and previous config saved to /var/cache/conftool/dbconfig/20221123-223611-ladsgroup.json
  • 22:31 cstone: civicrm upgraded from fca1c8a6 to efff01e9
  • 22:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P40864 and previous config saved to /var/cache/conftool/dbconfig/20221123-222903-marostegui.json
  • 22:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2158 (T323214)', diff saved to https://phabricator.wikimedia.org/P40862 and previous config saved to /var/cache/conftool/dbconfig/20221123-222627-ladsgroup.json
  • 22:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2095.codfw.wmnet with reason: Maintenance
  • 22:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2095.codfw.wmnet with reason: Maintenance
  • 22:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2158.codfw.wmnet with reason: Maintenance
  • 22:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2158.codfw.wmnet with reason: Maintenance
  • 22:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 (T323214)', diff saved to https://phabricator.wikimedia.org/P40861 and previous config saved to /var/cache/conftool/dbconfig/20221123-222105-ladsgroup.json
  • 22:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T321126)', diff saved to https://phabricator.wikimedia.org/P40860 and previous config saved to /var/cache/conftool/dbconfig/20221123-221356-marostegui.json
  • 22:11 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2122 (T321126)', diff saved to https://phabricator.wikimedia.org/P40859 and previous config saved to /var/cache/conftool/dbconfig/20221123-221125-marostegui.json
  • 22:11 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2122.codfw.wmnet with reason: Maintenance
  • 22:11 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2122.codfw.wmnet with reason: Maintenance
  • 22:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121 (T321126)', diff saved to https://phabricator.wikimedia.org/P40858 and previous config saved to /var/cache/conftool/dbconfig/20221123-221103-marostegui.json
  • 22:03 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 22:02 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 22:02 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
  • 22:02 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
  • 21:59 reedy@deploy1002: Synchronized php-1.40.0-wmf.10/includes/language/Message.php: T323236 (duration: 04m 35s)
  • 21:57 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 21:56 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 21:56 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
  • 21:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P40857 and previous config saved to /var/cache/conftool/dbconfig/20221123-215557-marostegui.json
  • 21:55 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
  • 21:54 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host arclamp1001.eqiad.wmnet with OS bullseye
  • 21:48 jbond@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2050.codfw.wmnet with OS bullseye
  • 21:48 jbond@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be2050.codfw.wmnet with OS bullseye
  • 21:45 pt1979@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cloudvirt1054']
  • 21:44 pt1979@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudvirt1054']
  • 21:44 pt1979@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cloudvirt1054']
  • 21:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P40855 and previous config saved to /var/cache/conftool/dbconfig/20221123-214050-marostegui.json
  • 21:38 brennen: end of UTC late backport and config window
  • 21:38 brennen@deploy1002: Finished scap: Backport for Update ky wikipedia logo (T323722) (duration: 06m 17s)
  • 21:35 pt1979@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudvirt1054']
  • 21:35 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 21:34 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 21:34 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
  • 21:34 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
  • 21:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1131 (T323214)', diff saved to https://phabricator.wikimedia.org/P40854 and previous config saved to /var/cache/conftool/dbconfig/20221123-213357-ladsgroup.json
  • 21:33 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1131.eqiad.wmnet with reason: Maintenance
  • 21:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1131.eqiad.wmnet with reason: Maintenance
  • 21:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 (T323214)', diff saved to https://phabricator.wikimedia.org/P40853 and previous config saved to /var/cache/conftool/dbconfig/20221123-213335-ladsgroup.json
  • 21:33 brennen@deploy1002: brennen and jdlrobson: Backport for Update ky wikipedia logo (T323722) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet
  • 21:31 brennen@deploy1002: Started scap: Backport for Update ky wikipedia logo (T323722)
  • 21:31 jbond@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2050.codfw.wmnet with OS bullseye
  • 21:31 jdrewniak@deploy1002: backport aborted: (duration: 02m 40s)
  • 21:31 jdrewniak@deploy1002: sync-world aborted: Backport for Update ky wikipedia logo (T323722) (duration: 01m 38s)
  • 21:31 pt1979@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1061.mgmt.eqiad.wmnet with reboot policy FORCED
  • 21:31 jbond@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host ms-be2050.codfw.wmnet with OS bullseye
  • 21:29 jdrewniak@deploy1002: Started scap: Backport for Update ky wikipedia logo (T323722)
  • 21:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121 (T321126)', diff saved to https://phabricator.wikimedia.org/P40852 and previous config saved to /var/cache/conftool/dbconfig/20221123-212543-marostegui.json
  • 21:24 brennen@deploy1002: Finished scap: Backport for Update favicon and CentralAuthLoginIcon for wikifunctionswiki (T323627) (duration: 06m 29s)
  • 21:23 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 21:23 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2121 (T321126)', diff saved to https://phabricator.wikimedia.org/P40851 and previous config saved to /var/cache/conftool/dbconfig/20221123-212312-marostegui.json
  • 21:23 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 21:23 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
  • 21:23 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 21:22 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 21:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120 (T321126)', diff saved to https://phabricator.wikimedia.org/P40850 and previous config saved to /var/cache/conftool/dbconfig/20221123-212250-marostegui.json
  • 21:22 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
  • 21:19 brennen@deploy1002: brennen and stang: Backport for Update favicon and CentralAuthLoginIcon for wikifunctionswiki (T323627) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet
  • 21:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P40849 and previous config saved to /var/cache/conftool/dbconfig/20221123-211829-ladsgroup.json
  • 21:18 brennen@deploy1002: Started scap: Backport for Update favicon and CentralAuthLoginIcon for wikifunctionswiki (T323627)
  • 21:16 cjming@deploy1002: backport aborted: (duration: 06m 39s)
  • 21:16 cjming@deploy1002: sync-world aborted: Backport for Update favicon and CentralAuthLoginIcon for wikifunctionswiki (T323627) (duration: 06m 24s)
  • 21:12 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 21:12 pt1979@cumin1001: START - Cookbook sre.hosts.provision for host cloudvirt1061.mgmt.eqiad.wmnet with reboot policy FORCED
  • 21:11 pt1979@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1061.mgmt.eqiad.wmnet with reboot policy FORCED
  • 21:11 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 21:11 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
  • 21:11 pt1979@cumin1001: START - Cookbook sre.hosts.provision for host cloudvirt1061.mgmt.eqiad.wmnet with reboot policy FORCED
  • 21:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
  • 21:10 cjming@deploy1002: Started scap: Backport for Update favicon and CentralAuthLoginIcon for wikifunctionswiki (T323627)
  • 21:08 cjming@deploy1002: scap failed: CalledProcessError Command 'sudo -u mwbuilder /usr/local/bin/update-mediawiki-tools-release' returned non-zero exit status 1. (duration: 02m 57s)
  • 21:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120', diff saved to https://phabricator.wikimedia.org/P40848 and previous config saved to /var/cache/conftool/dbconfig/20221123-210744-marostegui.json
  • 21:06 pt1979@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1060.mgmt.eqiad.wmnet with reboot policy FORCED
  • 21:05 cjming@deploy1002: Started scap: Backport for Update favicon and CentralAuthLoginIcon for wikifunctionswiki (T323627)
  • 21:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P40846 and previous config saved to /var/cache/conftool/dbconfig/20221123-210322-ladsgroup.json
  • 20:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2141.codfw.wmnet with reason: Maintenance
  • 20:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2141.codfw.wmnet with reason: Maintenance
  • 20:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2129 (T323214)', diff saved to https://phabricator.wikimedia.org/P40845 and previous config saved to /var/cache/conftool/dbconfig/20221123-205926-ladsgroup.json
  • 20:59 jbond@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2050.codfw.wmnet with OS bullseye
  • 20:57 jbond@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host ms-be2050.codfw.wmnet with OS bullseye
  • 20:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120', diff saved to https://phabricator.wikimedia.org/P40844 and previous config saved to /var/cache/conftool/dbconfig/20221123-205238-marostegui.json
  • 20:52 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host arclamp1001.eqiad.wmnet with OS bullseye
  • 20:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 (T323214)', diff saved to https://phabricator.wikimedia.org/P40843 and previous config saved to /var/cache/conftool/dbconfig/20221123-204816-ladsgroup.json
  • 20:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2129', diff saved to https://phabricator.wikimedia.org/P40842 and previous config saved to /var/cache/conftool/dbconfig/20221123-204420-ladsgroup.json
  • 20:41 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host arclamp1001.eqiad.wmnet with OS bullseye
  • 20:40 jbond@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2050.codfw.wmnet with OS bullseye
  • 20:38 jbond@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be2050.codfw.wmnet with OS bullseye
  • 20:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120 (T321126)', diff saved to https://phabricator.wikimedia.org/P40841 and previous config saved to /var/cache/conftool/dbconfig/20221123-203731-marostegui.json
  • 20:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2120 (T321126)', diff saved to https://phabricator.wikimedia.org/P40840 and previous config saved to /var/cache/conftool/dbconfig/20221123-203459-marostegui.json
  • 20:34 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2120.codfw.wmnet with reason: Maintenance
  • 20:34 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2120.codfw.wmnet with reason: Maintenance
  • 20:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108 (T321126)', diff saved to https://phabricator.wikimedia.org/P40839 and previous config saved to /var/cache/conftool/dbconfig/20221123-203437-marostegui.json
  • 20:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2129', diff saved to https://phabricator.wikimedia.org/P40838 and previous config saved to /var/cache/conftool/dbconfig/20221123-202914-ladsgroup.json
  • 20:20 pt1979@cumin1001: START - Cookbook sre.hosts.provision for host cloudvirt1060.mgmt.eqiad.wmnet with reboot policy FORCED
  • 20:20 pt1979@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1060.mgmt.eqiad.wmnet with reboot policy FORCED
  • 20:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108', diff saved to https://phabricator.wikimedia.org/P40837 and previous config saved to /var/cache/conftool/dbconfig/20221123-201931-marostegui.json
  • 20:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2129 (T323214)', diff saved to https://phabricator.wikimedia.org/P40836 and previous config saved to /var/cache/conftool/dbconfig/20221123-201407-ladsgroup.json
  • 20:08 pt1979@cumin1001: START - Cookbook sre.hosts.provision for host cloudvirt1060.mgmt.eqiad.wmnet with reboot policy FORCED
  • 20:07 pt1979@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1059.mgmt.eqiad.wmnet with reboot policy FORCED
  • 20:06 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for phab1004.eqiad.wmnet
  • 20:06 dzahn@cumin2002: START - Cookbook sre.hosts.remove-downtime for phab1004.eqiad.wmnet
  • 20:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108', diff saved to https://phabricator.wikimedia.org/P40835 and previous config saved to /var/cache/conftool/dbconfig/20221123-200424-marostegui.json
  • 20:03 sukhe: running homer for Gerrit: 860103
  • 20:03 jbond@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2050.codfw.wmnet with OS bullseye
  • 20:02 jbond@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be2050.codfw.wmnet with OS bullseye
  • 19:59 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts lvs4007.ulsfo.wmnet
  • 19:59 sukhe@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:59 sukhe@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: lvs4007.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - sukhe@cumin2002"
  • 19:51 sukhe@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: lvs4007.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - sukhe@cumin2002"
  • 19:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108 (T321126)', diff saved to https://phabricator.wikimedia.org/P40833 and previous config saved to /var/cache/conftool/dbconfig/20221123-194918-marostegui.json
  • 19:48 sukhe@cumin2002: START - Cookbook sre.dns.netbox
  • 19:46 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2108 (T321126)', diff saved to https://phabricator.wikimedia.org/P40832 and previous config saved to /var/cache/conftool/dbconfig/20221123-194646-marostegui.json
  • 19:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2108.codfw.wmnet with reason: Maintenance
  • 19:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2108.codfw.wmnet with reason: Maintenance
  • 19:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2100.codfw.wmnet with reason: Maintenance
  • 19:45 jbond@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2050.codfw.wmnet with OS bullseye
  • 19:45 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2100.codfw.wmnet with reason: Maintenance
  • 19:45 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 19:45 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 19:44 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 19:44 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 19:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T321126)', diff saved to https://phabricator.wikimedia.org/P40831 and previous config saved to /var/cache/conftool/dbconfig/20221123-194441-marostegui.json
  • 19:43 jbond@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be2050.codfw.wmnet with OS bullseye
  • 19:41 sukhe@cumin2002: START - Cookbook sre.hosts.decommission for hosts lvs4007.ulsfo.wmnet
  • 19:41 sukhe: decommission lvs4007: T317247
  • 19:39 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host contint1002.wikimedia.org with OS buster
  • 19:39 sukhe: [done] running homer for Gerrit: 860089
  • 19:38 pt1979@cumin1001: START - Cookbook sre.hosts.provision for host cloudvirt1059.mgmt.eqiad.wmnet with reboot policy FORCED
  • 19:37 mutante: phab1004 - re-enabling puppet - phd should stay stopped, dumps and logmail should keep running
  • 19:37 pt1979@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1059.mgmt.eqiad.wmnet with reboot policy FORCED
  • 19:37 sukhe: running homer for Gerrit: 860089
  • 19:35 pt1979@cumin1001: START - Cookbook sre.hosts.provision for host cloudvirt1059.mgmt.eqiad.wmnet with reboot policy FORCED
  • 19:34 pt1979@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1058.mgmt.eqiad.wmnet with reboot policy FORCED
  • 19:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P40830 and previous config saved to /var/cache/conftool/dbconfig/20221123-192934-marostegui.json
  • 19:29 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host arclamp1001.eqiad.wmnet with OS bullseye
  • 19:26 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs4010.ulsfo.wmnet with OS buster
  • 19:24 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on contint1002.wikimedia.org with reason: host reimage
  • 19:21 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on contint1002.wikimedia.org with reason: host reimage
  • 19:16 jbond@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2050.codfw.wmnet with OS bullseye
  • 19:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P40829 and previous config saved to /var/cache/conftool/dbconfig/20221123-191427-marostegui.json
  • 19:13 jbond@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be2050.codfw.wmnet with OS bullseye
  • 19:09 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host contint1002.wikimedia.org with OS buster
  • 19:09 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs4010.ulsfo.wmnet with reason: host reimage
  • 19:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3316 (T323214)', diff saved to https://phabricator.wikimedia.org/P40828 and previous config saved to /var/cache/conftool/dbconfig/20221123-190812-ladsgroup.json
  • 19:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1113.eqiad.wmnet with reason: Maintenance
  • 19:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1113.eqiad.wmnet with reason: Maintenance
  • 19:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 (T323214)', diff saved to https://phabricator.wikimedia.org/P40827 and previous config saved to /var/cache/conftool/dbconfig/20221123-190739-ladsgroup.json
  • 19:06 pt1979@cumin1001: START - Cookbook sre.hosts.provision for host cloudvirt1058.mgmt.eqiad.wmnet with reboot policy FORCED
  • 19:05 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs4010.ulsfo.wmnet with reason: host reimage
  • 19:05 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['arclamp1001']
  • 19:04 pt1979@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1057.mgmt.eqiad.wmnet with reboot policy FORCED
  • 18:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T321126)', diff saved to https://phabricator.wikimedia.org/P40826 and previous config saved to /var/cache/conftool/dbconfig/20221123-185920-marostegui.json
  • 18:56 btullis@cumin2002: END (PASS) - Cookbook sre.kafka.roll-restart-brokers (exit_code=0) for Kafka A:kafka-jumbo-eqiad cluster: Roll restart of jvm daemons.
  • 18:55 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1202 (T321126)', diff saved to https://phabricator.wikimedia.org/P40825 and previous config saved to /var/cache/conftool/dbconfig/20221123-185505-marostegui.json
  • 18:55 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['arclamp1001']
  • 18:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1202.eqiad.wmnet with reason: Maintenance
  • 18:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1202.eqiad.wmnet with reason: Maintenance
  • 18:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T321126)', diff saved to https://phabricator.wikimedia.org/P40824 and previous config saved to /var/cache/conftool/dbconfig/20221123-185444-marostegui.json
  • 18:53 jbond@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2050.codfw.wmnet with OS bullseye
  • 18:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P40823 and previous config saved to /var/cache/conftool/dbconfig/20221123-185233-ladsgroup.json
  • 18:51 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host arclamp1001.mgmt.eqiad.wmnet with reboot policy FORCED
  • 18:45 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host lvs4010.ulsfo.wmnet with OS buster
  • 18:42 sukhe: restart pybal on lvs4007.ulsfo.wmnet
  • 18:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2129 (T323214)', diff saved to https://phabricator.wikimedia.org/P40822 and previous config saved to /var/cache/conftool/dbconfig/20221123-184207-ladsgroup.json
  • 18:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2129.codfw.wmnet with reason: Maintenance
  • 18:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2129.codfw.wmnet with reason: Maintenance
  • 18:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2124 (T323214)', diff saved to https://phabricator.wikimedia.org/P40821 and previous config saved to /var/cache/conftool/dbconfig/20221123-184145-ladsgroup.json
  • 18:41 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host arclamp1001.mgmt.eqiad.wmnet with reboot policy FORCED
  • 18:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P40820 and previous config saved to /var/cache/conftool/dbconfig/20221123-183937-marostegui.json
  • 18:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P40819 and previous config saved to /var/cache/conftool/dbconfig/20221123-183726-ladsgroup.json
  • 18:37 pt1979@cumin1001: START - Cookbook sre.hosts.provision for host cloudvirt1057.mgmt.eqiad.wmnet with reboot policy FORCED
  • 18:36 pt1979@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1056.mgmt.eqiad.wmnet with reboot policy FORCED
  • 18:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2124', diff saved to https://phabricator.wikimedia.org/P40818 and previous config saved to /var/cache/conftool/dbconfig/20221123-182638-ladsgroup.json
  • 18:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P40817 and previous config saved to /var/cache/conftool/dbconfig/20221123-182431-marostegui.json
  • 18:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 (T323214)', diff saved to https://phabricator.wikimedia.org/P40816 and previous config saved to /var/cache/conftool/dbconfig/20221123-182220-ladsgroup.json
  • 18:12 ryankemper@cumin1001: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic cluster restart; prev restart was done before some hosts had ran puppet - ryankemper@cumin1001 - T319020
  • 18:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2124', diff saved to https://phabricator.wikimedia.org/P40815 and previous config saved to /var/cache/conftool/dbconfig/20221123-181132-ladsgroup.json
  • 18:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T321126)', diff saved to https://phabricator.wikimedia.org/P40814 and previous config saved to /var/cache/conftool/dbconfig/20221123-180924-marostegui.json
  • 18:08 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/proton: apply
  • 18:08 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/proton: apply
  • 18:07 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1194 (T321126)', diff saved to https://phabricator.wikimedia.org/P40813 and previous config saved to /var/cache/conftool/dbconfig/20221123-180709-marostegui.json
  • 18:07 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1194.eqiad.wmnet with reason: Maintenance
  • 18:06 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1194.eqiad.wmnet with reason: Maintenance
  • 18:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T321126)', diff saved to https://phabricator.wikimedia.org/P40812 and previous config saved to /var/cache/conftool/dbconfig/20221123-180648-marostegui.json
  • 18:04 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/proton: apply
  • 18:03 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/proton: apply
  • 18:03 oblivian@deploy1002: helmfile [staging] DONE helmfile.d/services/proton: apply
  • 18:02 oblivian@deploy1002: helmfile [staging] START helmfile.d/services/proton: apply
  • 18:01 pt1979@cumin1001: START - Cookbook sre.hosts.provision for host cloudvirt1056.mgmt.eqiad.wmnet with reboot policy FORCED
  • 18:00 pt1979@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1055.mgmt.eqiad.wmnet with reboot policy FORCED
  • 17:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2124 (T323214)', diff saved to https://phabricator.wikimedia.org/P40810 and previous config saved to /var/cache/conftool/dbconfig/20221123-175625-ladsgroup.json
  • 17:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P40809 and previous config saved to /var/cache/conftool/dbconfig/20221123-175141-marostegui.json
  • 17:44 ryankemper: [Elastic] T319020 Kicked off rolling restart of cloudelastic to apply new heap size 8->10G; see `ryankemper@cumin1001` tmux session `cloudelastic_restarts`
  • 17:42 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic cluster restart; prev restart was done before some hosts had ran puppet - ryankemper@cumin1001 - T319020
  • 17:42 pt1979@cumin1001: START - Cookbook sre.hosts.provision for host cloudvirt1055.mgmt.eqiad.wmnet with reboot policy FORCED
  • 17:39 urandom: initiating Cassandra bootstrap, aqs1018-a -- T307802
  • 17:37 pt1979@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1055.mgmt.eqiad.wmnet with reboot policy FORCED
  • 17:36 pt1979@cumin1001: START - Cookbook sre.hosts.provision for host cloudvirt1055.mgmt.eqiad.wmnet with reboot policy FORCED
  • 17:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P40807 and previous config saved to /var/cache/conftool/dbconfig/20221123-173635-marostegui.json
  • 17:33 eevans@cumin1001: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching aqs[2001-2004].codfw.wmnet,aqs[1010-1015].eqiad.wmnet: T314309 restarting to pick up new JRE - eevans@cumin1001
  • 17:27 pt1979@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1054.mgmt.eqiad.wmnet with reboot policy FORCED
  • 17:22 oblivian@deploy1002: helmfile [staging] DONE helmfile.d/services/proton: apply
  • 17:21 oblivian@deploy1002: helmfile [staging] START helmfile.d/services/proton: apply
  • 17:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T321126)', diff saved to https://phabricator.wikimedia.org/P40806 and previous config saved to /var/cache/conftool/dbconfig/20221123-172128-marostegui.json
  • 17:19 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1191 (T321126)', diff saved to https://phabricator.wikimedia.org/P40805 and previous config saved to /var/cache/conftool/dbconfig/20221123-171911-marostegui.json
  • 17:19 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1191.eqiad.wmnet with reason: Maintenance
  • 17:18 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1191.eqiad.wmnet with reason: Maintenance
  • 17:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T321126)', diff saved to https://phabricator.wikimedia.org/P40804 and previous config saved to /var/cache/conftool/dbconfig/20221123-171850-marostegui.json
  • 17:18 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:18 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for arclamp1001 - pt1979@cumin2002"
  • 17:16 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for arclamp1001 - pt1979@cumin2002"
  • 17:12 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 17:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P40803 and previous config saved to /var/cache/conftool/dbconfig/20221123-170343-marostegui.json
  • 16:57 pt1979@cumin1001: START - Cookbook sre.hosts.provision for host cloudvirt1054.mgmt.eqiad.wmnet with reboot policy FORCED
  • 16:56 pt1979@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1054.mgmt.eqiad.wmnet with reboot policy FORCED
  • 16:56 pt1979@cumin1001: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['contint1002']
  • 16:52 pt1979@cumin1001: START - Cookbook sre.hosts.provision for host cloudvirt1054.mgmt.eqiad.wmnet with reboot policy FORCED
  • 16:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P40802 and previous config saved to /var/cache/conftool/dbconfig/20221123-164837-marostegui.json
  • 16:46 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/image-suggestion: apply
  • 16:45 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/image-suggestion: apply
  • 16:43 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/image-suggestion: apply
  • 16:42 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/image-suggestion: apply
  • 16:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3316 (T323214)', diff saved to https://phabricator.wikimedia.org/P40801 and previous config saved to /var/cache/conftool/dbconfig/20221123-163412-ladsgroup.json
  • 16:34 pt1979@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['contint1002']
  • 16:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1098.eqiad.wmnet with reason: Maintenance
  • 16:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1098.eqiad.wmnet with reason: Maintenance
  • 16:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 (T323214)', diff saved to https://phabricator.wikimedia.org/P40800 and previous config saved to /var/cache/conftool/dbconfig/20221123-163351-ladsgroup.json
  • 16:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T321126)', diff saved to https://phabricator.wikimedia.org/P40799 and previous config saved to /var/cache/conftool/dbconfig/20221123-163330-marostegui.json
  • 16:31 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1174 (T321126)', diff saved to https://phabricator.wikimedia.org/P40798 and previous config saved to /var/cache/conftool/dbconfig/20221123-163115-marostegui.json
  • 16:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 16:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 16:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 16:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 16:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T321126)', diff saved to https://phabricator.wikimedia.org/P40797 and previous config saved to /var/cache/conftool/dbconfig/20221123-163018-marostegui.json
  • 16:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2124 (T323214)', diff saved to https://phabricator.wikimedia.org/P40796 and previous config saved to /var/cache/conftool/dbconfig/20221123-162407-ladsgroup.json
  • 16:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2124.codfw.wmnet with reason: Maintenance
  • 16:23 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2124.codfw.wmnet with reason: Maintenance
  • 16:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2117 (T323214)', diff saved to https://phabricator.wikimedia.org/P40795 and previous config saved to /var/cache/conftool/dbconfig/20221123-162345-ladsgroup.json
  • 16:23 pt1979@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host contint1002.mgmt.eqiad.wmnet with reboot policy FORCED
  • 16:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316', diff saved to https://phabricator.wikimedia.org/P40794 and previous config saved to /var/cache/conftool/dbconfig/20221123-161844-ladsgroup.json
  • 16:17 eevans@cumin1001: START - Cookbook sre.cassandra.roll-restart for nodes matching aqs[2001-2004].codfw.wmnet,aqs[1010-1015].eqiad.wmnet: T314309 restarting to pick up new JRE - eevans@cumin1001
  • 16:16 pt1979@cumin1001: START - Cookbook sre.hosts.provision for host contint1002.mgmt.eqiad.wmnet with reboot policy FORCED
  • 16:16 pt1979@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host contint1002.mgmt.eqiad.wmnet with reboot policy FORCED
  • 16:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P40793 and previous config saved to /var/cache/conftool/dbconfig/20221123-161512-marostegui.json
  • 16:10 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/thumbor: sync
  • 16:09 hnowlan@deploy1002: helmfile [eqiad] START helmfile.d/services/thumbor: sync
  • 16:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2117', diff saved to https://phabricator.wikimedia.org/P40792 and previous config saved to /var/cache/conftool/dbconfig/20221123-160837-ladsgroup.json
  • 16:08 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/thumbor: sync
  • 16:07 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/thumbor: sync
  • 16:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316', diff saved to https://phabricator.wikimedia.org/P40791 and previous config saved to /var/cache/conftool/dbconfig/20221123-160338-ladsgroup.json
  • 16:03 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/thumbor: sync
  • 16:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1132 (re)pooling @ 100%: Maint done', diff saved to https://phabricator.wikimedia.org/P40790 and previous config saved to /var/cache/conftool/dbconfig/20221123-160022-ladsgroup.json
  • 16:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P40789 and previous config saved to /var/cache/conftool/dbconfig/20221123-160005-marostegui.json
  • 15:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2117', diff saved to https://phabricator.wikimedia.org/P40788 and previous config saved to /var/cache/conftool/dbconfig/20221123-155330-ladsgroup.json
  • 15:53 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/thumbor: sync
  • 15:52 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/thumbor: sync
  • 15:52 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/thumbor: sync
  • 15:51 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/thumbor: sync
  • 15:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 (T323214)', diff saved to https://phabricator.wikimedia.org/P40787 and previous config saved to /var/cache/conftool/dbconfig/20221123-154831-ladsgroup.json
  • 15:45 sukhe@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:45 sukhe@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Updating for lvs4009 and lvs4010 - sukhe@cumin2002"
  • 15:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1132 (re)pooling @ 75%: Maint done', diff saved to https://phabricator.wikimedia.org/P40786 and previous config saved to /var/cache/conftool/dbconfig/20221123-154517-ladsgroup.json
  • 15:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T321126)', diff saved to https://phabricator.wikimedia.org/P40785 and previous config saved to /var/cache/conftool/dbconfig/20221123-154459-marostegui.json
  • 15:44 sukhe@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Updating for lvs4009 and lvs4010 - sukhe@cumin2002"
  • 15:42 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3317 (T321126)', diff saved to https://phabricator.wikimedia.org/P40784 and previous config saved to /var/cache/conftool/dbconfig/20221123-154242-marostegui.json
  • 15:42 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 15:42 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 15:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T321126)', diff saved to https://phabricator.wikimedia.org/P40783 and previous config saved to /var/cache/conftool/dbconfig/20221123-154220-marostegui.json
  • 15:42 sukhe@cumin2002: START - Cookbook sre.dns.netbox
  • 15:41 btullis@cumin2002: START - Cookbook sre.kafka.roll-restart-brokers for Kafka A:kafka-jumbo-eqiad cluster: Roll restart of jvm daemons.
  • 15:41 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/thumbor: sync
  • 15:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2117 (T323214)', diff saved to https://phabricator.wikimedia.org/P40782 and previous config saved to /var/cache/conftool/dbconfig/20221123-153824-ladsgroup.json
  • 15:35 pt1979@cumin1001: START - Cookbook sre.hosts.provision for host contint1002.mgmt.eqiad.wmnet with reboot policy FORCED
  • 15:31 oblivian@deploy1002: helmfile [staging] DONE helmfile.d/services/image-suggestion: apply
  • 15:30 oblivian@deploy1002: helmfile [staging] START helmfile.d/services/image-suggestion: apply
  • 15:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1132 (re)pooling @ 10%: Maint done', diff saved to https://phabricator.wikimedia.org/P40780 and previous config saved to /var/cache/conftool/dbconfig/20221123-153012-ladsgroup.json
  • 15:29 pt1979@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host contint1002.mgmt.eqiad.wmnet with reboot policy FORCED
  • 15:29 jforrester@deploy1002: Finished deploy [integration/docroot@52e4a00]: Deploying 52e4a00 for T311097 pointing Codex docs to latest (duration: 00m 14s)
  • 15:28 jforrester@deploy1002: Started deploy [integration/docroot@52e4a00]: Deploying 52e4a00 for T311097 pointing Codex docs to latest
  • 15:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P40779 and previous config saved to /var/cache/conftool/dbconfig/20221123-152714-marostegui.json
  • 15:15 pt1979@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 15:15 moritzm: updating snapshot* hosts to PHP 7.4.33-1+0~20221108.73+debian10~1.gbpa00350a+wmf10u1 T323358
  • 15:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1132 (re)pooling @ 25%: Maint done', diff saved to https://phabricator.wikimedia.org/P40778 and previous config saved to /var/cache/conftool/dbconfig/20221123-151507-ladsgroup.json
  • 15:13 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 15:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P40777 and previous config saved to /var/cache/conftool/dbconfig/20221123-151207-marostegui.json
  • 15:11 pt1979@cumin1001: START - Cookbook sre.hosts.provision for host contint1002.mgmt.eqiad.wmnet with reboot policy FORCED
  • 15:10 claime: deploying change 859575 on mw-* wikikube deployments
  • 15:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1132.eqiad.wmnet with reason: Maintenance
  • 15:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1132.eqiad.wmnet with reason: Maintenance
  • 15:09 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 15:09 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 15:08 cgoubert@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
  • 15:08 cgoubert@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
  • 15:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T321312)', diff saved to https://phabricator.wikimedia.org/P40776 and previous config saved to /var/cache/conftool/dbconfig/20221123-150719-ladsgroup.json
  • 15:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depool db1132 Maint', diff saved to https://phabricator.wikimedia.org/P40775 and previous config saved to /var/cache/conftool/dbconfig/20221123-150621-ladsgroup.json
  • 14:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T321126)', diff saved to https://phabricator.wikimedia.org/P40774 and previous config saved to /var/cache/conftool/dbconfig/20221123-145701-marostegui.json
  • 14:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1158 (T321126)', diff saved to https://phabricator.wikimedia.org/P40773 and previous config saved to /var/cache/conftool/dbconfig/20221123-145446-marostegui.json
  • 14:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 14:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 14:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 14:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 14:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P40772 and previous config saved to /var/cache/conftool/dbconfig/20221123-145212-ladsgroup.json
  • 14:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136 (T321126)', diff saved to https://phabricator.wikimedia.org/P40771 and previous config saved to /var/cache/conftool/dbconfig/20221123-144735-marostegui.json
  • 14:41 moritzm: rebalance Ganeti group B/eqiad T311687
  • 14:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P40770 and previous config saved to /var/cache/conftool/dbconfig/20221123-143706-ladsgroup.json
  • 14:36 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1045.eqiad.wmnet with OS bullseye
  • 14:32 cmooney@cumin1001: START - Cookbook sre.dns.netbox
  • 14:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136', diff saved to https://phabricator.wikimedia.org/P40769 and previous config saved to /var/cache/conftool/dbconfig/20221123-143228-marostegui.json
  • 14:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T321312)', diff saved to https://phabricator.wikimedia.org/P40768 and previous config saved to /var/cache/conftool/dbconfig/20221123-142159-ladsgroup.json
  • 14:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136', diff saved to https://phabricator.wikimedia.org/P40767 and previous config saved to /var/cache/conftool/dbconfig/20221123-141722-marostegui.json
  • 14:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1166 (T321312)', diff saved to https://phabricator.wikimedia.org/P40766 and previous config saved to /var/cache/conftool/dbconfig/20221123-141543-ladsgroup.json
  • 14:15 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 14:15 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 14:15 cgoubert@cumin1001: conftool action : set/pooled=true; selector: name=eqiad,dnsdisc=mw-api-ext
  • 14:14 cgoubert@cumin1001: conftool action : set/pooled=true; selector: name=eqiad,dnsdisc=mw-web
  • 14:14 cgoubert@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=mw-api-ext-ro
  • 14:14 cgoubert@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=mw-web-ro
  • 14:10 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1045.eqiad.wmnet with reason: host reimage
  • 14:07 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1027.eqiad.wmnet to cluster eqiad and group C
  • 14:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2117 (T323214)', diff saved to https://phabricator.wikimedia.org/P40765 and previous config saved to /var/cache/conftool/dbconfig/20221123-140732-ladsgroup.json
  • 14:07 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2117.codfw.wmnet with reason: Maintenance
  • 14:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3316 (T323214)', diff saved to https://phabricator.wikimedia.org/P40764 and previous config saved to /var/cache/conftool/dbconfig/20221123-140712-ladsgroup.json
  • 14:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2117.codfw.wmnet with reason: Maintenance
  • 14:07 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1096.eqiad.wmnet with reason: Maintenance
  • 14:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1096.eqiad.wmnet with reason: Maintenance
  • 14:06 aborrero@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1045.eqiad.wmnet with reason: host reimage
  • 14:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136 (T321126)', diff saved to https://phabricator.wikimedia.org/P40763 and previous config saved to /var/cache/conftool/dbconfig/20221123-140215-marostegui.json
  • 13:57 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1027.eqiad.wmnet to cluster eqiad and group C
  • 13:53 aborrero@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1045.eqiad.wmnet with OS bullseye
  • 13:39 moritzm: updating mw canaries to 7.4.33-1+0~20221108.73+debian10~1.gbpa00350a+wmf10u1 T323358
  • 13:25 moritzm: installing apache security updates on mw canaries
  • 13:02 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1046.eqiad.wmnet with OS bullseye
  • 13:02 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1136 (T321126)', diff saved to https://phabricator.wikimedia.org/P40762 and previous config saved to /var/cache/conftool/dbconfig/20221123-130159-marostegui.json
  • 13:01 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1136.eqiad.wmnet with reason: Maintenance
  • 13:01 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1136.eqiad.wmnet with reason: Maintenance
  • 13:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T321126)', diff saved to https://phabricator.wikimedia.org/P40761 and previous config saved to /var/cache/conftool/dbconfig/20221123-130138-marostegui.json
  • 12:58 cgoubert@cumin1001: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on D{lvs2009.codfw.wmnet,lvs1019.eqiad.wmnet} and A:lvs (T323621)
  • 12:58 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/thumbor: sync
  • 12:55 cgoubert@cumin1001: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on D{lvs2009.codfw.wmnet,lvs1019.eqiad.wmnet} and A:lvs (T323621)
  • 12:52 cgoubert@cumin1001: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on D{lvs2010.codfw.wmnet,lvs1020.eqiad.wmnet} and A:lvs (T323621)
  • 12:49 cgoubert@cumin1001: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on D{lvs2010.codfw.wmnet,lvs1020.eqiad.wmnet} and A:lvs (T323621)
  • 12:48 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/thumbor: sync
  • 12:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P40760 and previous config saved to /var/cache/conftool/dbconfig/20221123-124631-marostegui.json
  • 12:43 jbond@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts sretest1002.eqiad.wmnet
  • 12:36 jbond@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1002.eqiad.wmnet
  • 12:36 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1046.eqiad.wmnet with reason: host reimage
  • 12:33 cgoubert@cumin1001: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on D{lvs2010.codfw.wmnet,lvs1020.eqiad.wmnet} and A:lvs (T323621)
  • 12:32 claime: restarting pybal on lvs2010.codfw.wmnet,lvs1020.eqiad.wmnet for mw-web and mw-api-ext behind LVS T323621
  • 12:32 aborrero@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1046.eqiad.wmnet with reason: host reimage
  • 12:32 cgoubert@cumin1001: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on D{lvs2010.codfw.wmnet,lvs1020.eqiad.wmnet} and A:lvs (T323621)
  • 12:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P40759 and previous config saved to /var/cache/conftool/dbconfig/20221123-123125-marostegui.json
  • 12:19 aborrero@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1046.eqiad.wmnet with OS bullseye
  • 12:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T321126)', diff saved to https://phabricator.wikimedia.org/P40758 and previous config saved to /var/cache/conftool/dbconfig/20221123-121618-marostegui.json
  • 12:14 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1127 (T321126)', diff saved to https://phabricator.wikimedia.org/P40756 and previous config saved to /var/cache/conftool/dbconfig/20221123-121402-marostegui.json
  • 12:13 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1127.eqiad.wmnet with reason: Maintenance
  • 12:13 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1127.eqiad.wmnet with reason: Maintenance
  • 12:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 (T321126)', diff saved to https://phabricator.wikimedia.org/P40755 and previous config saved to /var/cache/conftool/dbconfig/20221123-121340-marostegui.json
  • 12:07 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 12:06 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 12:06 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
  • 12:05 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
  • 12:01 lucaswerkmeister-wmde: Deployed security patch for T323592
  • 11:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P40754 and previous config saved to /var/cache/conftool/dbconfig/20221123-115834-marostegui.json
  • 11:55 moritzm: updating mw canaries to 7.4.33-1+0~20221108.73+debian10~1.gbpa00350a+wmf10u1 T323358
  • 11:52 aborrero@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=1) for host cloudvirt1047.eqiad.wmnet with OS bullseye
  • 11:46 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb1002.eqiad.wmnet
  • 11:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P40753 and previous config saved to /var/cache/conftool/dbconfig/20221123-114327-marostegui.json
  • 11:42 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netboxdb1002.eqiad.wmnet
  • 11:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb2002.codfw.wmnet
  • 11:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netboxdb2002.codfw.wmnet
  • 11:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 (T321126)', diff saved to https://phabricator.wikimedia.org/P40752 and previous config saved to /var/cache/conftool/dbconfig/20221123-112821-marostegui.json
  • 11:26 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3317 (T321126)', diff saved to https://phabricator.wikimedia.org/P40751 and previous config saved to /var/cache/conftool/dbconfig/20221123-112604-marostegui.json
  • 11:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1101.eqiad.wmnet with reason: Maintenance
  • 11:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1101.eqiad.wmnet with reason: Maintenance
  • 11:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 (T321126)', diff saved to https://phabricator.wikimedia.org/P40750 and previous config saved to /var/cache/conftool/dbconfig/20221123-112542-marostegui.json
  • 11:24 topranks: changing port-speed configuration syntax on asw1-b12-drmrs
  • 11:23 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1047.eqiad.wmnet with reason: host reimage
  • 11:22 claime: authdns-update for mw-web and mw-api-ext
  • 11:20 aborrero@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1047.eqiad.wmnet with reason: host reimage
  • 11:15 claime: Adding mw-web and mw-api-ext to wmnet dns
  • 11:14 volans@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Test - volans@cumin1001"
  • 11:12 volans@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Test - volans@cumin1001"
  • 11:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P40748 and previous config saved to /var/cache/conftool/dbconfig/20221123-111036-marostegui.json
  • 11:06 aborrero@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1047.eqiad.wmnet with OS bullseye
  • 10:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P40747 and previous config saved to /var/cache/conftool/dbconfig/20221123-105529-marostegui.json
  • 10:49 isaranto@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
  • 10:48 isaranto@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
  • 10:47 isaranto@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
  • 10:46 isaranto@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
  • 10:45 isaranto@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
  • 10:42 isaranto@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
  • 10:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 (T321126)', diff saved to https://phabricator.wikimedia.org/P40746 and previous config saved to /var/cache/conftool/dbconfig/20221123-104023-marostegui.json
  • 10:38 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3317 (T321126)', diff saved to https://phabricator.wikimedia.org/P40745 and previous config saved to /var/cache/conftool/dbconfig/20221123-103805-marostegui.json
  • 10:37 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1098.eqiad.wmnet with reason: Maintenance
  • 10:37 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1098.eqiad.wmnet with reason: Maintenance
  • 10:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1027.eqiad.wmnet
  • 10:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1027.eqiad.wmnet
  • 10:11 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin1001.eqiad.wmnet
  • 10:08 jbond@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "final sync before merging 804575 - jbond@cumin2002"
  • 10:05 jbond@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "final sync before merging 804575 - jbond@cumin2002"
  • 10:00 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host cumin1001.eqiad.wmnet
  • 09:42 stevemunene@deploy1002: Finished deploy [analytics/turnilo/deploy@51da050]: (no justification provided) (duration: 00m 05s)
  • 09:42 stevemunene@deploy1002: Started deploy [analytics/turnilo/deploy@51da050]: (no justification provided)
  • 09:33 stevemunene@deploy1002: Finished deploy [analytics/turnilo/deploy@51da050]: (no justification provided) (duration: 00m 15s)
  • 09:33 stevemunene@deploy1002: Started deploy [analytics/turnilo/deploy@51da050]: (no justification provided)
  • 09:19 elukey: restart kube-apiserver on ml-staging-ctrl2001 as attempt to mitigate weird LIST latencies
  • 09:16 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 09:16 Emperor: set thanos ring replicas to 3.10 T311690
  • 09:15 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 09:14 elukey: restart kube-apiserver on ml-serve-ctrl1001 as attempt to mitigate weird LIST latencies
  • 09:12 cgoubert@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
  • 09:11 cgoubert@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
  • 09:06 cgoubert@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
  • 09:06 cgoubert@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
  • 08:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti1027.eqiad.wmnet with OS bullseye
  • 08:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti1027.eqiad.wmnet with reason: host reimage
  • 08:25 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti1027.eqiad.wmnet with reason: host reimage
  • 08:14 kartik@deploy1002: Finished scap: Backport for Make Western Frisian Wikipedia Machine Translation stricter by 10% (T323415) (duration: 10m 00s)
  • 08:12 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1027.eqiad.wmnet with OS bullseye
  • 08:04 kartik@deploy1002: kartik and kartik: Backport for Make Western Frisian Wikipedia Machine Translation stricter by 10% (T323415) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet
  • 08:04 kartik@deploy1002: Started scap: Backport for Make Western Frisian Wikipedia Machine Translation stricter by 10% (T323415)
  • 08:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on ganeti1027.eqiad.wmnet with reason: Remove from cluster for eventual reimage
  • 08:00 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on ganeti1027.eqiad.wmnet with reason: Remove from cluster for eventual reimage
  • 07:48 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1163.eqiad.wmnet with reason: Maintenance
  • 07:48 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1163.eqiad.wmnet with reason: Maintenance
  • 07:37 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2112.codfw.wmnet with reason: Maintenance
  • 07:37 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2112.codfw.wmnet with reason: Maintenance
  • 07:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T321130)', diff saved to https://phabricator.wikimedia.org/P40743 and previous config saved to /var/cache/conftool/dbconfig/20221123-073714-marostegui.json
  • 07:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P40742 and previous config saved to /var/cache/conftool/dbconfig/20221123-072208-marostegui.json
  • 07:12 marostegui@cumin1001: dbctl commit (dc=all): 'db1185 (re)pooling @ 100%: After schema change', diff saved to https://phabricator.wikimedia.org/P40741 and previous config saved to /var/cache/conftool/dbconfig/20221123-071246-root.json
  • 07:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P40740 and previous config saved to /var/cache/conftool/dbconfig/20221123-070659-marostegui.json
  • 06:57 marostegui@cumin1001: dbctl commit (dc=all): 'db1185 (re)pooling @ 75%: After schema change', diff saved to https://phabricator.wikimedia.org/P40739 and previous config saved to /var/cache/conftool/dbconfig/20221123-065741-root.json
  • 06:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T321130)', diff saved to https://phabricator.wikimedia.org/P40738 and previous config saved to /var/cache/conftool/dbconfig/20221123-065153-marostegui.json
  • 06:42 marostegui@cumin1001: dbctl commit (dc=all): 'db1185 (re)pooling @ 50%: After schema change', diff saved to https://phabricator.wikimedia.org/P40737 and previous config saved to /var/cache/conftool/dbconfig/20221123-064236-root.json
  • 06:39 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2176 (T321130)', diff saved to https://phabricator.wikimedia.org/P40736 and previous config saved to /var/cache/conftool/dbconfig/20221123-063932-marostegui.json
  • 06:39 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2176.codfw.wmnet with reason: Maintenance
  • 06:39 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2176.codfw.wmnet with reason: Maintenance
  • 06:29 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2174.codfw.wmnet with reason: Maintenance
  • 06:29 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2174.codfw.wmnet with reason: Maintenance
  • 06:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311 (T321130)', diff saved to https://phabricator.wikimedia.org/P40735 and previous config saved to /var/cache/conftool/dbconfig/20221123-062905-marostegui.json
  • 06:27 marostegui@cumin1001: dbctl commit (dc=all): 'db1185 (re)pooling @ 25%: After schema change', diff saved to https://phabricator.wikimedia.org/P40734 and previous config saved to /var/cache/conftool/dbconfig/20221123-062731-root.json
  • 06:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311', diff saved to https://phabricator.wikimedia.org/P40733 and previous config saved to /var/cache/conftool/dbconfig/20221123-061358-marostegui.json
  • 06:12 marostegui@cumin1001: dbctl commit (dc=all): 'db1185 (re)pooling @ 10%: After schema change', diff saved to https://phabricator.wikimedia.org/P40732 and previous config saved to /var/cache/conftool/dbconfig/20221123-061226-root.json
  • 06:12 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1130.eqiad.wmnet with reason: Maintenance
  • 06:11 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1130.eqiad.wmnet with reason: Maintenance
  • 06:10 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2113.codfw.wmnet with reason: Maintenance
  • 06:10 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2113.codfw.wmnet with reason: Maintenance
  • 06:09 marostegui@cumin1001: dbctl commit (dc=all): 'db1185 (re)pooling @ 1%: After schema change', diff saved to https://phabricator.wikimedia.org/P40731 and previous config saved to /var/cache/conftool/dbconfig/20221123-060956-root.json
  • 06:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T321126)', diff saved to https://phabricator.wikimedia.org/P40730 and previous config saved to /var/cache/conftool/dbconfig/20221123-060500-marostegui.json
  • 06:02 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1185 (T321126)', diff saved to https://phabricator.wikimedia.org/P40729 and previous config saved to /var/cache/conftool/dbconfig/20221123-060228-marostegui.json
  • 06:02 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1185.eqiad.wmnet with reason: Maintenance
  • 06:02 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1185.eqiad.wmnet with reason: Maintenance
  • 05:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311', diff saved to https://phabricator.wikimedia.org/P40728 and previous config saved to /var/cache/conftool/dbconfig/20221123-055852-marostegui.json
  • 05:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311 (T321130)', diff saved to https://phabricator.wikimedia.org/P40727 and previous config saved to /var/cache/conftool/dbconfig/20221123-054345-marostegui.json
  • 05:31 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2170:3311 (T321130)', diff saved to https://phabricator.wikimedia.org/P40726 and previous config saved to /var/cache/conftool/dbconfig/20221123-053104-marostegui.json
  • 05:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2170.codfw.wmnet with reason: Maintenance
  • 05:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2170.codfw.wmnet with reason: Maintenance
  • 05:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311 (T321130)', diff saved to https://phabricator.wikimedia.org/P40725 and previous config saved to /var/cache/conftool/dbconfig/20221123-053043-marostegui.json
  • 05:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311', diff saved to https://phabricator.wikimedia.org/P40724 and previous config saved to /var/cache/conftool/dbconfig/20221123-051536-marostegui.json
  • 05:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311', diff saved to https://phabricator.wikimedia.org/P40723 and previous config saved to /var/cache/conftool/dbconfig/20221123-050029-marostegui.json
  • 04:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311 (T321130)', diff saved to https://phabricator.wikimedia.org/P40722 and previous config saved to /var/cache/conftool/dbconfig/20221123-044523-marostegui.json
  • 04:31 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2167:3311 (T321130)', diff saved to https://phabricator.wikimedia.org/P40721 and previous config saved to /var/cache/conftool/dbconfig/20221123-043135-marostegui.json
  • 04:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2167.codfw.wmnet with reason: Maintenance
  • 04:31 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2167.codfw.wmnet with reason: Maintenance
  • 04:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T321130)', diff saved to https://phabricator.wikimedia.org/P40720 and previous config saved to /var/cache/conftool/dbconfig/20221123-043114-marostegui.json
  • 04:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P40719 and previous config saved to /var/cache/conftool/dbconfig/20221123-041607-marostegui.json
  • 04:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P40718 and previous config saved to /var/cache/conftool/dbconfig/20221123-040100-marostegui.json
  • 03:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T321130)', diff saved to https://phabricator.wikimedia.org/P40717 and previous config saved to /var/cache/conftool/dbconfig/20221123-034554-marostegui.json
  • 03:33 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2153 (T321130)', diff saved to https://phabricator.wikimedia.org/P40716 and previous config saved to /var/cache/conftool/dbconfig/20221123-033332-marostegui.json
  • 03:33 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2153.codfw.wmnet with reason: Maintenance
  • 03:33 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2153.codfw.wmnet with reason: Maintenance
  • 03:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T321130)', diff saved to https://phabricator.wikimedia.org/P40715 and previous config saved to /var/cache/conftool/dbconfig/20221123-033310-marostegui.json
  • 03:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P40714 and previous config saved to /var/cache/conftool/dbconfig/20221123-031804-marostegui.json
  • 03:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P40713 and previous config saved to /var/cache/conftool/dbconfig/20221123-030257-marostegui.json
  • 02:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T321130)', diff saved to https://phabricator.wikimedia.org/P40712 and previous config saved to /var/cache/conftool/dbconfig/20221123-024751-marostegui.json
  • 02:42 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp2041.codfw.wmnet with OS bullseye
  • 02:34 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2146 (T321130)', diff saved to https://phabricator.wikimedia.org/P40711 and previous config saved to /var/cache/conftool/dbconfig/20221123-023453-marostegui.json
  • 02:34 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2146.codfw.wmnet with reason: Maintenance
  • 02:34 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2146.codfw.wmnet with reason: Maintenance
  • 02:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T321130)', diff saved to https://phabricator.wikimedia.org/P40710 and previous config saved to /var/cache/conftool/dbconfig/20221123-023431-marostegui.json
  • 02:30 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cp2041']
  • 02:27 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp2041']
  • 02:27 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cp2041']
  • 02:19 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp2041']
  • 02:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P40709 and previous config saved to /var/cache/conftool/dbconfig/20221123-021925-marostegui.json
  • 02:18 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp2041.codfw.wmnet with reason: host reimage
  • 02:15 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cp2041']
  • 02:15 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp2041']
  • 02:14 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp2041.codfw.wmnet with reason: host reimage
  • 02:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P40708 and previous config saved to /var/cache/conftool/dbconfig/20221123-020418-marostegui.json
  • 01:55 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp2041.codfw.wmnet with OS bullseye
  • 01:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T321130)', diff saved to https://phabricator.wikimedia.org/P40707 and previous config saved to /var/cache/conftool/dbconfig/20221123-014912-marostegui.json
  • 01:43 sukhe@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp2041.codfw.wmnet with OS bullseye
  • 01:36 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2145 (T321130)', diff saved to https://phabricator.wikimedia.org/P40706 and previous config saved to /var/cache/conftool/dbconfig/20221123-013627-marostegui.json
  • 01:36 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2145.codfw.wmnet with reason: Maintenance
  • 01:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2145.codfw.wmnet with reason: Maintenance
  • 01:29 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp2041.codfw.wmnet with OS bullseye
  • 01:29 sukhe@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp2041.codfw.wmnet with OS bullseye
  • 01:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2141.codfw.wmnet with reason: Maintenance
  • 01:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2141.codfw.wmnet with reason: Maintenance
  • 01:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130 (T321130)', diff saved to https://phabricator.wikimedia.org/P40705 and previous config saved to /var/cache/conftool/dbconfig/20221123-012524-marostegui.json
  • 01:16 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp2041.codfw.wmnet with OS bullseye
  • 01:11 sukhe@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp2041.codfw.wmnet with OS bullseye
  • 01:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P40704 and previous config saved to /var/cache/conftool/dbconfig/20221123-011018-marostegui.json
  • 01:01 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp2041.codfw.wmnet with OS bullseye
  • 01:00 sukhe: sudo rm /etc/dhcp/automation/ttyS1-115200/cp2041.conf
  • 00:59 sukhe@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp2041.codfw.wmnet with OS bullseye
  • 00:59 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp2041.codfw.wmnet with OS bullseye
  • 00:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P40703 and previous config saved to /var/cache/conftool/dbconfig/20221123-005511-marostegui.json
  • 00:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130 (T321130)', diff saved to https://phabricator.wikimedia.org/P40702 and previous config saved to /var/cache/conftool/dbconfig/20221123-004005-marostegui.json
  • 00:27 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2130 (T321130)', diff saved to https://phabricator.wikimedia.org/P40701 and previous config saved to /var/cache/conftool/dbconfig/20221123-002716-marostegui.json
  • 00:27 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2130.codfw.wmnet with reason: Maintenance
  • 00:27 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2130.codfw.wmnet with reason: Maintenance
  • 00:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116 (T321130)', diff saved to https://phabricator.wikimedia.org/P40700 and previous config saved to /var/cache/conftool/dbconfig/20221123-002654-marostegui.json
  • 00:14 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dbprov1004.eqiad.wmnet with OS bullseye
  • 00:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P40699 and previous config saved to /var/cache/conftool/dbconfig/20221123-001147-marostegui.json

2022-11-22

  • 23:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P40698 and previous config saved to /var/cache/conftool/dbconfig/20221122-235641-marostegui.json
  • 23:53 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dbprov1004.eqiad.wmnet with reason: host reimage
  • 23:50 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on dbprov1004.eqiad.wmnet with reason: host reimage
  • 23:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116 (T321130)', diff saved to https://phabricator.wikimedia.org/P40697 and previous config saved to /var/cache/conftool/dbconfig/20221122-234134-marostegui.json
  • 23:29 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2116 (T321130)', diff saved to https://phabricator.wikimedia.org/P40696 and previous config saved to /var/cache/conftool/dbconfig/20221122-232903-marostegui.json
  • 23:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2116.codfw.wmnet with reason: Maintenance
  • 23:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2116.codfw.wmnet with reason: Maintenance
  • 23:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103 (T321130)', diff saved to https://phabricator.wikimedia.org/P40695 and previous config saved to /var/cache/conftool/dbconfig/20221122-232841-marostegui.json
  • 23:16 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host dbprov1004.eqiad.wmnet with OS bullseye
  • 23:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103', diff saved to https://phabricator.wikimedia.org/P40694 and previous config saved to /var/cache/conftool/dbconfig/20221122-231334-marostegui.json
  • 23:06 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host puppetdb1003.eqiad.wmnet with OS bullseye
  • 22:59 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['dbprov1004']
  • 22:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103', diff saved to https://phabricator.wikimedia.org/P40693 and previous config saved to /var/cache/conftool/dbconfig/20221122-225828-marostegui.json
  • 22:52 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on puppetdb1003.eqiad.wmnet with reason: host reimage
  • 22:48 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on puppetdb1003.eqiad.wmnet with reason: host reimage
  • 22:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103 (T321130)', diff saved to https://phabricator.wikimedia.org/P40692 and previous config saved to /var/cache/conftool/dbconfig/20221122-224321-marostegui.json
  • 22:38 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['dbprov1004']
  • 22:37 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['dbprov1004']
  • 22:36 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host puppetdb1003.eqiad.wmnet with OS bullseye
  • 22:34 mutante: phabricator: on phab1001 user 'phd' is UID 497, on phab1004 user 'phd' is UID 920 (this is desired and a fix!). But because UID 497 was now free, it became the UID of user 'vcs' on phab1004, while on phab1001 user 'vcs' is UID 498. So we use "find /srv/repos -uid 497 -exec chown phd {} \;" to give files owned by 497 to phd. T280597
  • 22:31 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['dbprov1004']
  • 22:30 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['dbprov1004']
  • 22:30 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2103 (T321130)', diff saved to https://phabricator.wikimedia.org/P40691 and previous config saved to /var/cache/conftool/dbconfig/20221122-223047-marostegui.json
  • 22:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2103.codfw.wmnet with reason: Maintenance
  • 22:30 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['dbprov1004']
  • 22:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2103.codfw.wmnet with reason: Maintenance
  • 22:24 mutante: temporarily disabling puppet on 17 hosts using rsync::quickdatacopy to carefully deploy gerrit:715636, which allows multiple destination hosts for syncing
  • 22:19 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2102.codfw.wmnet with reason: Maintenance
  • 22:19 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2102.codfw.wmnet with reason: Maintenance
  • 22:17 mutante: phab1004 - rsyncing /srv/repos from phab1001 with a 2 MiB/s bwlimit - pulling - rsync -avp --bwlimit=2m --delete rsync://phab1001.eqiad.wmnet/srv-repos/ /srv/repos/ - T280597
  • 22:15 mutante: phab1004 - rsyncing /srv/repos from phab1001 with a 2 MiB/s bwlimit
  • 22:06 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2097.codfw.wmnet with reason: Maintenance
  • 22:06 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2097.codfw.wmnet with reason: Maintenance
  • 21:59 TheresNoTime: close UTC late backport window
  • 21:58 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['dbprov1004']
  • 21:58 samtar@deploy1002: Finished scap: Backport for Update TOC to use PinnableHeader (T317897) (duration: 06m 11s)
  • 21:56 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 21:56 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 21:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T321130)', diff saved to https://phabricator.wikimedia.org/P40690 and previous config saved to /var/cache/conftool/dbconfig/20221122-215610-marostegui.json
  • 21:52 samtar@deploy1002: samtar and jdlrobson: Backport for Update TOC to use PinnableHeader (T317897) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet
  • 21:52 samtar@deploy1002: Started scap: Backport for Update TOC to use PinnableHeader (T317897)
  • 21:51 samtar@deploy1002: Finished scap: Backport for Fix icon button spacing in sticky header (T323176) (duration: 07m 25s)
  • 21:44 samtar@deploy1002: samtar and bwang: Backport for Fix icon button spacing in sticky header (T323176) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
  • 21:44 samtar@deploy1002: Started scap: Backport for Fix icon button spacing in sticky header (T323176)
  • 21:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P40689 and previous config saved to /var/cache/conftool/dbconfig/20221122-214103-marostegui.json
  • 21:33 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['dbprov1004']
  • 21:32 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dbprov1004.mgmt.eqiad.wmnet with reboot policy FORCED
  • 21:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P40688 and previous config saved to /var/cache/conftool/dbconfig/20221122-212556-marostegui.json
  • 21:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T321130)', diff saved to https://phabricator.wikimedia.org/P40687 and previous config saved to /var/cache/conftool/dbconfig/20221122-211049-marostegui.json
  • 21:04 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['puppetdb1003']
  • 21:03 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host dbprov1004.mgmt.eqiad.wmnet with reboot policy FORCED
  • 21:03 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dbprov1004.mgmt.eqiad.wmnet with reboot policy FORCED
  • 21:02 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host dbprov1004.mgmt.eqiad.wmnet with reboot policy FORCED
  • 21:01 samtar@deploy1002: backport aborted: (duration: 00m 33s)
  • 20:58 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['puppetdb1003']
  • 20:57 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['puppetdb1003']
  • 20:57 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1196 (T321130)', diff saved to https://phabricator.wikimedia.org/P40686 and previous config saved to /var/cache/conftool/dbconfig/20221122-205720-marostegui.json
  • 20:57 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1196.eqiad.wmnet with reason: Maintenance
  • 20:57 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1196.eqiad.wmnet with reason: Maintenance
  • 20:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T321130)', diff saved to https://phabricator.wikimedia.org/P40685 and previous config saved to /var/cache/conftool/dbconfig/20221122-205659-marostegui.json
  • 20:48 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['puppetdb1003']
  • 20:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P40684 and previous config saved to /var/cache/conftool/dbconfig/20221122-204153-marostegui.json
  • 20:36 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host puppetdb1003.mgmt.eqiad.wmnet with reboot policy FORCED
  • 20:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P40683 and previous config saved to /var/cache/conftool/dbconfig/20221122-202646-marostegui.json
  • 20:23 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host puppetdb1003.mgmt.eqiad.wmnet with reboot policy FORCED
  • 20:21 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 20:19 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 20:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T321130)', diff saved to https://phabricator.wikimedia.org/P40682 and previous config saved to /var/cache/conftool/dbconfig/20221122-201140-marostegui.json
  • 20:07 sukhe: sudo ipmitool -I lanplus -H "cp2041.mgmt.codfw.wmnet" -U root -E chassis power cycle
  • 20:05 sukhe@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cp2041']
  • 20:05 sukhe@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp2041']
  • 20:05 sukhe@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cp2041']
  • 20:04 brett@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp2041.codfw.wmnet with OS bullseye
  • 20:04 sukhe@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp2041']
  • 20:04 sukhe@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cp2041']
  • 20:04 sukhe@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp2041']
  • 20:04 sukhe@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cp2041.codfw.wmnet']
  • 20:04 sukhe@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp2041.codfw.wmnet']
  • 20:03 sukhe@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cp2041.codfw.wmnet']
  • 20:03 sukhe@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp2041.codfw.wmnet']
  • 20:03 sukhe@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cp2041']
  • 20:03 sukhe@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp2041']
  • 19:59 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1186 (T321130)', diff saved to https://phabricator.wikimedia.org/P40681 and previous config saved to /var/cache/conftool/dbconfig/20221122-195929-marostegui.json
  • 19:59 sukhe@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cp2041']
  • 19:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1186.eqiad.wmnet with reason: Maintenance
  • 19:59 sukhe@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp2041']
  • 19:59 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1186.eqiad.wmnet with reason: Maintenance
  • 19:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T321130)', diff saved to https://phabricator.wikimedia.org/P40680 and previous config saved to /var/cache/conftool/dbconfig/20221122-195857-marostegui.json
  • 19:53 brett@cumin1001: START - Cookbook sre.hosts.reimage for host cp2041.codfw.wmnet with OS bullseye
  • 19:50 sukhe@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cp2041.codfw.wmnet']
  • 19:50 sukhe@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp2041.codfw.wmnet']
  • 19:47 sukhe@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cp2041.codfw.wmnet']
  • 19:47 sukhe@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp2041.codfw.wmnet']
  • 19:46 sukhe@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cp2041.codfw.wmnet']
  • 19:46 sukhe@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp2041.codfw.wmnet']
  • 19:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P40679 and previous config saved to /var/cache/conftool/dbconfig/20221122-194350-marostegui.json
  • 19:42 sukhe@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cp2041.codfw.wmnet']
  • 19:42 sukhe@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp2041.codfw.wmnet']
  • 19:32 ejegg: payments-wiki upgraded from 67ec07a3 to ba31fd62
  • 19:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P40678 and previous config saved to /var/cache/conftool/dbconfig/20221122-192844-marostegui.json
  • 19:28 sukhe@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp2041.codfw.wmnet with OS bullseye
  • 19:24 sukhe: running homer for Gerrit 859600: lvs4006 decommission
  • 19:19 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts lvs4006.ulsfo.wmnet
  • 19:19 sukhe@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:18 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp2041.codfw.wmnet with OS bullseye
  • 19:17 sukhe@cumin2002: START - Cookbook sre.dns.netbox
  • 19:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T321130)', diff saved to https://phabricator.wikimedia.org/P40677 and previous config saved to /var/cache/conftool/dbconfig/20221122-191337-marostegui.json
  • 19:13 sukhe@cumin2002: START - Cookbook sre.hosts.decommission for hosts lvs4006.ulsfo.wmnet
  • 19:00 ejegg: civicrm upgraded from ff512655 to fca1c8a6
  • 18:59 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1184 (T321130)', diff saved to https://phabricator.wikimedia.org/P40676 and previous config saved to /var/cache/conftool/dbconfig/20221122-185943-marostegui.json
  • 18:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1184.eqiad.wmnet with reason: Maintenance
  • 18:59 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1184.eqiad.wmnet with reason: Maintenance
  • 18:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T321130)', diff saved to https://phabricator.wikimedia.org/P40675 and previous config saved to /var/cache/conftool/dbconfig/20221122-185910-marostegui.json
  • 18:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T321126)', diff saved to https://phabricator.wikimedia.org/P40674 and previous config saved to /var/cache/conftool/dbconfig/20221122-184934-marostegui.json
  • 18:49 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on lvs4006.ulsfo.wmnet with reason: downtimed, in the process of decom
  • 18:48 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 4:00:00 on lvs4006.ulsfo.wmnet with reason: downtimed, in the process of decom
  • 18:48 sukhe: decommissioning lvs4006: T317247
  • 18:46 sukhe: cr[34]-ulsfo: set routing-options static route 198.35.26.112/28 next-hop 10.128.0.9: T317247
  • 18:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P40673 and previous config saved to /var/cache/conftool/dbconfig/20221122-184404-marostegui.json
  • 18:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P40672 and previous config saved to /var/cache/conftool/dbconfig/20221122-183428-marostegui.json
  • 18:34 brett@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp2041.codfw.wmnet with OS bullseye
  • 18:32 moritzm: installing pcre2 security updates
  • 18:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P40671 and previous config saved to /var/cache/conftool/dbconfig/20221122-182857-marostegui.json
  • 18:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P40670 and previous config saved to /var/cache/conftool/dbconfig/20221122-181919-marostegui.json
  • 18:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T321130)', diff saved to https://phabricator.wikimedia.org/P40669 and previous config saved to /var/cache/conftool/dbconfig/20221122-181351-marostegui.json
  • 18:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T321126)', diff saved to https://phabricator.wikimedia.org/P40668 and previous config saved to /var/cache/conftool/dbconfig/20221122-180412-marostegui.json
  • 18:01 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1169 (T321130)', diff saved to https://phabricator.wikimedia.org/P40667 and previous config saved to /var/cache/conftool/dbconfig/20221122-180109-marostegui.json
  • 18:01 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 18:01 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 18:00 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2178 (T321126)', diff saved to https://phabricator.wikimedia.org/P40666 and previous config saved to /var/cache/conftool/dbconfig/20221122-180049-marostegui.json
  • 18:00 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2178.codfw.wmnet with reason: Maintenance
  • 18:00 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2178.codfw.wmnet with reason: Maintenance
  • 18:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315 (T321126)', diff saved to https://phabricator.wikimedia.org/P40665 and previous config saved to /var/cache/conftool/dbconfig/20221122-180038-marostegui.json
  • 17:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1122 (re)pooling @ 100%: Maint done', diff saved to https://phabricator.wikimedia.org/P40664 and previous config saved to /var/cache/conftool/dbconfig/20221122-175750-ladsgroup.json
  • 17:56 btullis@cumin2002: END (PASS) - Cookbook sre.presto.roll-restart-workers (exit_code=0) for Presto analytics cluster: Roll restart of all Presto's jvm daemons.
  • 17:55 btullis@cumin1001: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0)
  • 17:55 btullis@cumin1001: Added views for new wiki: igwikiquote T314639
  • 17:50 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1140.eqiad.wmnet with reason: Maintenance
  • 17:50 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1140.eqiad.wmnet with reason: Maintenance
  • 17:45 btullis@cumin2002: START - Cookbook sre.presto.roll-restart-workers for Presto analytics cluster: Roll restart of all Presto's jvm daemons.
  • 17:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315', diff saved to https://phabricator.wikimedia.org/P40663 and previous config saved to /var/cache/conftool/dbconfig/20221122-174532-marostegui.json
  • 17:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1122 (re)pooling @ 75%: Maint done', diff saved to https://phabricator.wikimedia.org/P40662 and previous config saved to /var/cache/conftool/dbconfig/20221122-174245-ladsgroup.json
  • 17:39 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1139.eqiad.wmnet with reason: Maintenance
  • 17:39 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1139.eqiad.wmnet with reason: Maintenance
  • 17:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T321130)', diff saved to https://phabricator.wikimedia.org/P40661 and previous config saved to /var/cache/conftool/dbconfig/20221122-173913-marostegui.json
  • 17:38 brett@cumin1001: START - Cookbook sre.hosts.reimage for host cp2041.codfw.wmnet with OS bullseye
  • 17:30 btullis@cumin1001: START - Cookbook sre.wikireplicas.add-wiki
  • 17:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315', diff saved to https://phabricator.wikimedia.org/P40660 and previous config saved to /var/cache/conftool/dbconfig/20221122-173025-marostegui.json
  • 17:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1122 (re)pooling @ 25%: Maint done', diff saved to https://phabricator.wikimedia.org/P40659 and previous config saved to /var/cache/conftool/dbconfig/20221122-172740-ladsgroup.json
  • 17:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagetcd1006.eqiad.wmnet to plain
  • 17:25 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagetcd1006.eqiad.wmnet to plain
  • 17:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P40658 and previous config saved to /var/cache/conftool/dbconfig/20221122-172407-marostegui.json
  • 17:19 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagetcd1006.eqiad.wmnet to drbd
  • 17:17 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_eqiad: apply config changes - bking@cumin2002 - T319020
  • 17:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315 (T321126)', diff saved to https://phabricator.wikimedia.org/P40657 and previous config saved to /var/cache/conftool/dbconfig/20221122-171519-marostegui.json
  • 17:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1122 (re)pooling @ 10%: Maint done', diff saved to https://phabricator.wikimedia.org/P40656 and previous config saved to /var/cache/conftool/dbconfig/20221122-171235-ladsgroup.json
  • 17:12 btullis@cumin1001: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0)
  • 17:12 btullis@cumin1001: Added views for new wiki: bclwikiquote T316456
  • 17:11 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2171:3315 (T321126)', diff saved to https://phabricator.wikimedia.org/P40655 and previous config saved to /var/cache/conftool/dbconfig/20221122-171151-marostegui.json
  • 17:11 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2171.codfw.wmnet with reason: Maintenance
  • 17:11 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2171.codfw.wmnet with reason: Maintenance
  • 17:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T321126)', diff saved to https://phabricator.wikimedia.org/P40654 and previous config saved to /var/cache/conftool/dbconfig/20221122-171141-marostegui.json
  • 17:09 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagetcd1006.eqiad.wmnet to drbd
  • 17:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P40653 and previous config saved to /var/cache/conftool/dbconfig/20221122-170900-marostegui.json
  • 16:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P40652 and previous config saved to /var/cache/conftool/dbconfig/20221122-165634-marostegui.json
  • 16:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T321130)', diff saved to https://phabricator.wikimedia.org/P40651 and previous config saved to /var/cache/conftool/dbconfig/20221122-165354-marostegui.json
  • 16:49 eevans@deploy1002: helmfile [eqiad] DONE helmfile.d/services/sessionstore: apply
  • 16:48 eevans@deploy1002: helmfile [eqiad] START helmfile.d/services/sessionstore: apply
  • 16:47 btullis@cumin1001: START - Cookbook sre.wikireplicas.add-wiki
  • 16:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P40650 and previous config saved to /var/cache/conftool/dbconfig/20221122-164128-marostegui.json
  • 16:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1135 (T321130)', diff saved to https://phabricator.wikimedia.org/P40649 and previous config saved to /var/cache/conftool/dbconfig/20221122-164104-marostegui.json
  • 16:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1135.eqiad.wmnet with reason: Maintenance
  • 16:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1135.eqiad.wmnet with reason: Maintenance
  • 16:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 (T321130)', diff saved to https://phabricator.wikimedia.org/P40648 and previous config saved to /var/cache/conftool/dbconfig/20221122-164042-marostegui.json
  • 16:28 eevans@deploy1002: helmfile [codfw] DONE helmfile.d/services/sessionstore: apply
  • 16:27 eevans@deploy1002: helmfile [codfw] START helmfile.d/services/sessionstore: apply
  • 16:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T321126)', diff saved to https://phabricator.wikimedia.org/P40647 and previous config saved to /var/cache/conftool/dbconfig/20221122-162621-marostegui.json
  • 16:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P40646 and previous config saved to /var/cache/conftool/dbconfig/20221122-162536-marostegui.json
  • 16:23 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2157 (T321126)', diff saved to https://phabricator.wikimedia.org/P40645 and previous config saved to /var/cache/conftool/dbconfig/20221122-162257-marostegui.json
  • 16:22 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2157.codfw.wmnet with reason: Maintenance
  • 16:22 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2157.codfw.wmnet with reason: Maintenance
  • 16:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315 (T321126)', diff saved to https://phabricator.wikimedia.org/P40644 and previous config saved to /var/cache/conftool/dbconfig/20221122-162247-marostegui.json
  • 16:15 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 16:15 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 16:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T322618)', diff saved to https://phabricator.wikimedia.org/P40643 and previous config saved to /var/cache/conftool/dbconfig/20221122-161542-ladsgroup.json
  • 16:11 eevans@deploy1002: helmfile [staging] DONE helmfile.d/services/sessionstore: apply
  • 16:10 eevans@deploy1002: helmfile [staging] START helmfile.d/services/sessionstore: apply
  • 16:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P40642 and previous config saved to /var/cache/conftool/dbconfig/20221122-161029-marostegui.json
  • 16:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315', diff saved to https://phabricator.wikimedia.org/P40641 and previous config saved to /var/cache/conftool/dbconfig/20221122-160740-marostegui.json
  • 16:02 moritzm: drain ganeti1027 for eventual reimage to Bullseye T311687
  • 16:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P40640 and previous config saved to /var/cache/conftool/dbconfig/20221122-160036-ladsgroup.json
  • 15:59 cgoubert@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1122.eqiad.wmnet with reason: Maintenance
  • 15:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1122.eqiad.wmnet with reason: Maintenance
  • 15:57 claime: T323621 Add IPs for mw-web.svc and mw-api-ext.svc
  • 15:55 cgoubert@cumin1001: START - Cookbook sre.dns.netbox
  • 15:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 (T321130)', diff saved to https://phabricator.wikimedia.org/P40639 and previous config saved to /var/cache/conftool/dbconfig/20221122-155523-marostegui.json
  • 15:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315', diff saved to https://phabricator.wikimedia.org/P40638 and previous config saved to /var/cache/conftool/dbconfig/20221122-155234-marostegui.json
  • 15:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P40637 and previous config saved to /var/cache/conftool/dbconfig/20221122-154530-ladsgroup.json
  • 15:43 moritzm: importing php7.4 7.4.33-1+0~20221108.73+debian10~1.gbpa00350a+wmf10u1 to apt.wikimedia.org T323358
  • 15:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1134 (T321130)', diff saved to https://phabricator.wikimedia.org/P40636 and previous config saved to /var/cache/conftool/dbconfig/20221122-154127-marostegui.json
  • 15:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1134.eqiad.wmnet with reason: Maintenance
  • 15:41 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1134.eqiad.wmnet with reason: Maintenance
  • 15:39 topranks: updating route-distinguisher for cloud VRF on eqiad cloud switches
  • 15:37 moritzm: upgrading mwdebug2002 to PHP 7.4.33-1+0~20221108.73+debian10~1.gbpa00350a+wmf10u1
  • 15:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315 (T321126)', diff saved to https://phabricator.wikimedia.org/P40635 and previous config saved to /var/cache/conftool/dbconfig/20221122-153727-marostegui.json
  • 15:34 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.restart (exit_code=0)
  • 15:34 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2137:3315 (T321126)', diff saved to https://phabricator.wikimedia.org/P40634 and previous config saved to /var/cache/conftool/dbconfig/20221122-153403-marostegui.json
  • 15:34 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2137.codfw.wmnet with reason: Maintenance
  • 15:33 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2137.codfw.wmnet with reason: Maintenance
  • 15:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2128 (T321126)', diff saved to https://phabricator.wikimedia.org/P40633 and previous config saved to /var/cache/conftool/dbconfig/20221122-153352-marostegui.json
  • 15:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T322618)', diff saved to https://phabricator.wikimedia.org/P40632 and previous config saved to /var/cache/conftool/dbconfig/20221122-153235-ladsgroup.json
  • 15:31 bking@cumin2002: START - Cookbook sre.wdqs.restart
  • 15:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1133.eqiad.wmnet with reason: Maintenance
  • 15:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1133.eqiad.wmnet with reason: Maintenance
  • 15:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1132 (T321130)', diff saved to https://phabricator.wikimedia.org/P40631 and previous config saved to /var/cache/conftool/dbconfig/20221122-153038-marostegui.json
  • 15:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T322618)', diff saved to https://phabricator.wikimedia.org/P40630 and previous config saved to /var/cache/conftool/dbconfig/20221122-153023-ladsgroup.json
  • 15:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1202 (T322618)', diff saved to https://phabricator.wikimedia.org/P40629 and previous config saved to /var/cache/conftool/dbconfig/20221122-152813-ladsgroup.json
  • 15:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1202.eqiad.wmnet with reason: Maintenance
  • 15:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1202.eqiad.wmnet with reason: Maintenance
  • 15:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T322618)', diff saved to https://phabricator.wikimedia.org/P40628 and previous config saved to /var/cache/conftool/dbconfig/20221122-152751-ladsgroup.json
  • 15:27 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.restart (exit_code=0)
  • 15:25 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs4009.ulsfo.wmnet with OS buster
  • 15:22 bking@cumin2002: START - Cookbook sre.wdqs.restart
  • 15:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2128', diff saved to https://phabricator.wikimedia.org/P40627 and previous config saved to /var/cache/conftool/dbconfig/20221122-151846-marostegui.json
  • 15:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P40626 and previous config saved to /var/cache/conftool/dbconfig/20221122-151728-ladsgroup.json
  • 15:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1132', diff saved to https://phabricator.wikimedia.org/P40625 and previous config saved to /var/cache/conftool/dbconfig/20221122-151532-marostegui.json
  • 15:13 pt1979@cumin1001: START - Cookbook sre.hosts.provision for host dbprov1004.mgmt.eqiad.wmnet with reboot policy FORCED
  • 15:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P40624 and previous config saved to /var/cache/conftool/dbconfig/20221122-151245-ladsgroup.json
  • 15:11 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dbprov1004.mgmt.eqiad.wmnet with reboot policy FORCED
  • 15:06 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs4009.ulsfo.wmnet with reason: host reimage
  • 15:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2128', diff saved to https://phabricator.wikimedia.org/P40623 and previous config saved to /var/cache/conftool/dbconfig/20221122-150339-marostegui.json
  • 15:03 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs4009.ulsfo.wmnet with reason: host reimage
  • 15:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P40622 and previous config saved to /var/cache/conftool/dbconfig/20221122-150221-ladsgroup.json
  • 15:00 oblivian@deploy1002: Finished scap: Adding clusterconfig (duration: 04m 17s)
  • 15:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1132', diff saved to https://phabricator.wikimedia.org/P40621 and previous config saved to /var/cache/conftool/dbconfig/20221122-150025-marostegui.json
  • 14:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P40620 and previous config saved to /var/cache/conftool/dbconfig/20221122-145738-ladsgroup.json
  • 14:56 oblivian@deploy1002: Started scap: Adding clusterconfig
  • 14:55 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1122.eqiad.wmnet with reason: Maintenance
  • 14:55 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1122.eqiad.wmnet with reason: Maintenance
  • 14:55 jnuche@deploy1002: Finished scap: testing k8s deploys (duration: 06m 08s)
  • 14:53 btullis@cumin1001: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0)
  • 14:53 btullis@cumin1001: Added views for new wiki: tlwikiquote T317111
  • 14:48 jnuche@deploy1002: Started scap: testing k8s deploys
  • 14:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2128 (T321126)', diff saved to https://phabricator.wikimedia.org/P40619 and previous config saved to /var/cache/conftool/dbconfig/20221122-144833-marostegui.json
  • 14:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T322618)', diff saved to https://phabricator.wikimedia.org/P40618 and previous config saved to /var/cache/conftool/dbconfig/20221122-144715-ladsgroup.json
  • 14:47 cmooney@cumin1001: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin1001.eqiad.wmnet with reason: Release v0.6.1 - cmooney@cumin1001
  • 14:45 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host dbprov1004.mgmt.eqiad.wmnet with reboot policy FORCED
  • 14:45 cmooney@cumin1001: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin1001.eqiad.wmnet with reason: Release v0.6.1 - cmooney@cumin1001
  • 14:45 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dbprov1004.mgmt.eqiad.wmnet with reboot policy FORCED
  • 14:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1132 (T321130)', diff saved to https://phabricator.wikimedia.org/P40617 and previous config saved to /var/cache/conftool/dbconfig/20221122-144519-marostegui.json
  • 14:45 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2128 (T321126)', diff saved to https://phabricator.wikimedia.org/P40616 and previous config saved to /var/cache/conftool/dbconfig/20221122-144507-marostegui.json
  • 14:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2182 (T322618)', diff saved to https://phabricator.wikimedia.org/P40615 and previous config saved to /var/cache/conftool/dbconfig/20221122-144458-ladsgroup.json
  • 14:45 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db2094.codfw.wmnet with reason: Maintenance
  • 14:45 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db2094.codfw.wmnet with reason: Maintenance
  • 14:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2182.codfw.wmnet with reason: Maintenance
  • 14:44 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2128.codfw.wmnet with reason: Maintenance
  • 14:44 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2128.codfw.wmnet with reason: Maintenance
  • 14:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2123 (T321126)', diff saved to https://phabricator.wikimedia.org/P40614 and previous config saved to /var/cache/conftool/dbconfig/20221122-144446-marostegui.json
  • 14:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2182.codfw.wmnet with reason: Maintenance
  • 14:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317 (T322618)', diff saved to https://phabricator.wikimedia.org/P40613 and previous config saved to /var/cache/conftool/dbconfig/20221122-144436-ladsgroup.json
  • 14:43 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_eqiad: apply config changes - bking@cumin2002 - T319020
  • 14:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T322618)', diff saved to https://phabricator.wikimedia.org/P40612 and previous config saved to /var/cache/conftool/dbconfig/20221122-144232-ladsgroup.json
  • 14:41 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host lvs4009.ulsfo.wmnet with OS buster
  • 14:41 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
  • 14:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1122.eqiad.wmnet with reason: Maintenance
  • 14:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1122.eqiad.wmnet with reason: Maintenance
  • 14:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1122.eqiad.wmnet with reason: Maintenance
  • 14:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1122.eqiad.wmnet with reason: Maintenance
  • 14:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1194 (T322618)', diff saved to https://phabricator.wikimedia.org/P40611 and previous config saved to /var/cache/conftool/dbconfig/20221122-144023-ladsgroup.json
  • 14:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1194.eqiad.wmnet with reason: Maintenance
  • 14:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1194.eqiad.wmnet with reason: Maintenance
  • 14:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T322618)', diff saved to https://phabricator.wikimedia.org/P40610 and previous config saved to /var/cache/conftool/dbconfig/20221122-144002-ladsgroup.json
  • 14:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1122.eqiad.wmnet with reason: Maintenance
  • 14:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1122.eqiad.wmnet with reason: Maintenance
  • 14:39 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
  • 14:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1122.eqiad.wmnet with reason: Maintenance
  • 14:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1122.eqiad.wmnet with reason: Maintenance
  • 14:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1122.eqiad.wmnet with reason: Maintenance
  • 14:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1122.eqiad.wmnet with reason: Maintenance
  • 14:35 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
  • 14:34 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/api-gateway: apply
  • 14:33 oblivian@deploy1002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
  • 14:32 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1132 (T321130)', diff saved to https://phabricator.wikimedia.org/P40609 and previous config saved to /var/cache/conftool/dbconfig/20221122-143224-marostegui.json
  • 14:32 oblivian@deploy1002: helmfile [staging] START helmfile.d/services/api-gateway: apply
  • 14:32 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1132.eqiad.wmnet with reason: Maintenance
  • 14:32 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1132.eqiad.wmnet with reason: Maintenance
  • 14:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1128 (T321130)', diff saved to https://phabricator.wikimedia.org/P40608 and previous config saved to /var/cache/conftool/dbconfig/20221122-143203-marostegui.json
  • 14:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2123', diff saved to https://phabricator.wikimedia.org/P40607 and previous config saved to /var/cache/conftool/dbconfig/20221122-142939-marostegui.json
  • 14:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317', diff saved to https://phabricator.wikimedia.org/P40606 and previous config saved to /var/cache/conftool/dbconfig/20221122-142930-ladsgroup.json
  • 14:28 btullis@cumin1001: START - Cookbook sre.wikireplicas.add-wiki
  • 14:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P40605 and previous config saved to /var/cache/conftool/dbconfig/20221122-142455-ladsgroup.json
  • 14:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1032.eqiad.wmnet
  • 14:18 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host dbprov1004.mgmt.eqiad.wmnet with reboot policy FORCED
  • 14:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1128', diff saved to https://phabricator.wikimedia.org/P40604 and previous config saved to /var/cache/conftool/dbconfig/20221122-141656-marostegui.json
  • 14:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2123', diff saved to https://phabricator.wikimedia.org/P40603 and previous config saved to /var/cache/conftool/dbconfig/20221122-141433-marostegui.json
  • 14:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317', diff saved to https://phabricator.wikimedia.org/P40602 and previous config saved to /var/cache/conftool/dbconfig/20221122-141423-ladsgroup.json
  • 14:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1032.eqiad.wmnet
  • 14:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubestagetcd1004.eqiad.wmnet with reason: ganeti reboot
  • 14:12 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on kubestagetcd1004.eqiad.wmnet with reason: ganeti reboot
  • 14:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dse-k8s-etcd1001.eqiad.wmnet with reason: ganeti reboot
  • 14:12 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on dse-k8s-etcd1001.eqiad.wmnet with reason: ganeti reboot
  • 14:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on aux-k8s-etcd1003.eqiad.wmnet with reason: ganeti reboot
  • 14:11 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on aux-k8s-etcd1003.eqiad.wmnet with reason: ganeti reboot
  • 14:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P40601 and previous config saved to /var/cache/conftool/dbconfig/20221122-140949-ladsgroup.json
  • 14:06 marostegui@cumin1001: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0)
  • 14:06 marostegui@cumin1001: Added views for new wiki: bnwikiquote T319190
  • 14:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1128', diff saved to https://phabricator.wikimedia.org/P40600 and previous config saved to /var/cache/conftool/dbconfig/20221122-140150-marostegui.json
  • 13:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2123 (T321126)', diff saved to https://phabricator.wikimedia.org/P40599 and previous config saved to /var/cache/conftool/dbconfig/20221122-135926-marostegui.json
  • 13:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317 (T322618)', diff saved to https://phabricator.wikimedia.org/P40598 and previous config saved to /var/cache/conftool/dbconfig/20221122-135917-ladsgroup.json
  • 13:57 vgutierrez: block plain text requests on icinga.wm.o - T238720
  • 13:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2169:3317 (T322618)', diff saved to https://phabricator.wikimedia.org/P40597 and previous config saved to /var/cache/conftool/dbconfig/20221122-135659-ladsgroup.json
  • 13:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2169.codfw.wmnet with reason: Maintenance
  • 13:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2169.codfw.wmnet with reason: Maintenance
  • 13:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317 (T322618)', diff saved to https://phabricator.wikimedia.org/P40596 and previous config saved to /var/cache/conftool/dbconfig/20221122-135638-ladsgroup.json
  • 13:55 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2123 (T321126)', diff saved to https://phabricator.wikimedia.org/P40595 and previous config saved to /var/cache/conftool/dbconfig/20221122-135556-marostegui.json
  • 13:55 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2123.codfw.wmnet with reason: Maintenance
  • 13:55 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2123.codfw.wmnet with reason: Maintenance
  • 13:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2111 (T321126)', diff saved to https://phabricator.wikimedia.org/P40594 and previous config saved to /var/cache/conftool/dbconfig/20221122-135545-marostegui.json
  • 13:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T322618)', diff saved to https://phabricator.wikimedia.org/P40593 and previous config saved to /var/cache/conftool/dbconfig/20221122-135442-ladsgroup.json
  • 13:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1191 (T322618)', diff saved to https://phabricator.wikimedia.org/P40592 and previous config saved to /var/cache/conftool/dbconfig/20221122-135233-ladsgroup.json
  • 13:52 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1191.eqiad.wmnet with reason: Maintenance
  • 13:52 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1191.eqiad.wmnet with reason: Maintenance
  • 13:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T322618)', diff saved to https://phabricator.wikimedia.org/P40591 and previous config saved to /var/cache/conftool/dbconfig/20221122-135211-ladsgroup.json
  • 13:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1128 (T321130)', diff saved to https://phabricator.wikimedia.org/P40590 and previous config saved to /var/cache/conftool/dbconfig/20221122-134643-marostegui.json
  • 13:43 jclark@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:42 jclark@cumin1001: START - Cookbook sre.dns.netbox
  • 13:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317', diff saved to https://phabricator.wikimedia.org/P40589 and previous config saved to /var/cache/conftool/dbconfig/20221122-134131-ladsgroup.json
  • 13:41 marostegui@cumin1001: START - Cookbook sre.wikireplicas.add-wiki
  • 13:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2111', diff saved to https://phabricator.wikimedia.org/P40588 and previous config saved to /var/cache/conftool/dbconfig/20221122-134038-marostegui.json
  • 13:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P40587 and previous config saved to /var/cache/conftool/dbconfig/20221122-133705-ladsgroup.json
  • 13:34 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1128 (T321130)', diff saved to https://phabricator.wikimedia.org/P40586 and previous config saved to /var/cache/conftool/dbconfig/20221122-133401-marostegui.json
  • 13:33 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1128.eqiad.wmnet with reason: Maintenance
  • 13:33 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1128.eqiad.wmnet with reason: Maintenance
  • 13:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 (T321130)', diff saved to https://phabricator.wikimedia.org/P40585 and previous config saved to /var/cache/conftool/dbconfig/20221122-133339-marostegui.json
  • 13:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317', diff saved to https://phabricator.wikimedia.org/P40584 and previous config saved to /var/cache/conftool/dbconfig/20221122-132625-ladsgroup.json
  • 13:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2111', diff saved to https://phabricator.wikimedia.org/P40583 and previous config saved to /var/cache/conftool/dbconfig/20221122-132532-marostegui.json
  • 13:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P40582 and previous config saved to /var/cache/conftool/dbconfig/20221122-132158-ladsgroup.json
  • 13:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P40581 and previous config saved to /var/cache/conftool/dbconfig/20221122-131831-marostegui.json
  • 13:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317 (T322618)', diff saved to https://phabricator.wikimedia.org/P40580 and previous config saved to /var/cache/conftool/dbconfig/20221122-131118-ladsgroup.json
  • 13:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2111 (T321126)', diff saved to https://phabricator.wikimedia.org/P40579 and previous config saved to /var/cache/conftool/dbconfig/20221122-131025-marostegui.json
  • 13:09 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/thumbor: sync
  • 13:09 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/thumbor: sync
  • 13:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2168:3317 (T322618)', diff saved to https://phabricator.wikimedia.org/P40578 and previous config saved to /var/cache/conftool/dbconfig/20221122-130901-ladsgroup.json
  • 13:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2168.codfw.wmnet with reason: Maintenance
  • 13:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2168.codfw.wmnet with reason: Maintenance
  • 13:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T322618)', diff saved to https://phabricator.wikimedia.org/P40577 and previous config saved to /var/cache/conftool/dbconfig/20221122-130840-ladsgroup.json
  • 13:07 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1048.eqiad.wmnet with OS bullseye
  • 13:07 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2111 (T321126)', diff saved to https://phabricator.wikimedia.org/P40576 and previous config saved to /var/cache/conftool/dbconfig/20221122-130701-marostegui.json
  • 13:07 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2111.codfw.wmnet with reason: Maintenance
  • 13:06 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2111.codfw.wmnet with reason: Maintenance
  • 13:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T322618)', diff saved to https://phabricator.wikimedia.org/P40575 and previous config saved to /var/cache/conftool/dbconfig/20221122-130652-ladsgroup.json
  • 13:05 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2101.codfw.wmnet with reason: Maintenance
  • 13:05 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db2101.codfw.wmnet with reason: Maintenance
  • 13:04 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 13:04 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 13:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T321126)', diff saved to https://phabricator.wikimedia.org/P40574 and previous config saved to /var/cache/conftool/dbconfig/20221122-130447-marostegui.json
  • 13:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1174 (T322618)', diff saved to https://phabricator.wikimedia.org/P40573 and previous config saved to /var/cache/conftool/dbconfig/20221122-130442-ladsgroup.json
  • 13:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 13:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 13:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 13:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 13:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T322618)', diff saved to https://phabricator.wikimedia.org/P40572 and previous config saved to /var/cache/conftool/dbconfig/20221122-130403-ladsgroup.json
  • 13:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P40571 and previous config saved to /var/cache/conftool/dbconfig/20221122-130325-marostegui.json
  • 12:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P40570 and previous config saved to /var/cache/conftool/dbconfig/20221122-125333-ladsgroup.json
  • 12:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P40569 and previous config saved to /var/cache/conftool/dbconfig/20221122-124941-marostegui.json
  • 12:49 jnuche@deploy1002: Finished scap: testing k8s deploys (duration: 06m 20s)
  • 12:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P40568 and previous config saved to /var/cache/conftool/dbconfig/20221122-124856-ladsgroup.json
  • 12:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 (T321130)', diff saved to https://phabricator.wikimedia.org/P40567 and previous config saved to /var/cache/conftool/dbconfig/20221122-124818-marostegui.json
  • 12:43 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1048.eqiad.wmnet with reason: host reimage
  • 12:42 jnuche@deploy1002: Started scap: testing k8s deploys
  • 12:40 aborrero@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1048.eqiad.wmnet with reason: host reimage
  • 12:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P40565 and previous config saved to /var/cache/conftool/dbconfig/20221122-123827-ladsgroup.json
  • 12:37 jnuche@deploy1002: Installation of scap version "4.29.1" completed for 559 hosts
  • 12:36 jnuche@deploy1002: Installing scap version "4.29.1" for 559 hosts
  • 12:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1119 (T321130)', diff saved to https://phabricator.wikimedia.org/P40564 and previous config saved to /var/cache/conftool/dbconfig/20221122-123505-marostegui.json
  • 12:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1119.eqiad.wmnet with reason: Maintenance
  • 12:34 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1119.eqiad.wmnet with reason: Maintenance
  • 12:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1118 (T321130)', diff saved to https://phabricator.wikimedia.org/P40563 and previous config saved to /var/cache/conftool/dbconfig/20221122-123444-marostegui.json
  • 12:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P40562 and previous config saved to /var/cache/conftool/dbconfig/20221122-123435-marostegui.json
  • 12:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P40561 and previous config saved to /var/cache/conftool/dbconfig/20221122-123350-ladsgroup.json
  • 12:25 aborrero@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1048.eqiad.wmnet with OS bullseye
  • 12:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T322618)', diff saved to https://phabricator.wikimedia.org/P40560 and previous config saved to /var/cache/conftool/dbconfig/20221122-122320-ladsgroup.json
  • 12:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2159 (T322618)', diff saved to https://phabricator.wikimedia.org/P40559 and previous config saved to /var/cache/conftool/dbconfig/20221122-122103-ladsgroup.json
  • 12:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2095.codfw.wmnet with reason: Maintenance
  • 12:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2095.codfw.wmnet with reason: Maintenance
  • 12:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2159.codfw.wmnet with reason: Maintenance
  • 12:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2159.codfw.wmnet with reason: Maintenance
  • 12:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T322618)', diff saved to https://phabricator.wikimedia.org/P40558 and previous config saved to /var/cache/conftool/dbconfig/20221122-122025-ladsgroup.json
  • 12:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1118', diff saved to https://phabricator.wikimedia.org/P40557 and previous config saved to /var/cache/conftool/dbconfig/20221122-121938-marostegui.json
  • 12:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T321126)', diff saved to https://phabricator.wikimedia.org/P40556 and previous config saved to /var/cache/conftool/dbconfig/20221122-121928-marostegui.json
  • 12:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T322618)', diff saved to https://phabricator.wikimedia.org/P40555 and previous config saved to /var/cache/conftool/dbconfig/20221122-121843-ladsgroup.json
  • 12:17 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1200 (T321126)', diff saved to https://phabricator.wikimedia.org/P40554 and previous config saved to /var/cache/conftool/dbconfig/20221122-121657-marostegui.json
  • 12:16 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1200.eqiad.wmnet with reason: Maintenance
  • 12:16 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1200.eqiad.wmnet with reason: Maintenance
  • 12:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T321126)', diff saved to https://phabricator.wikimedia.org/P40553 and previous config saved to /var/cache/conftool/dbconfig/20221122-121647-marostegui.json
  • 12:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3317 (T322618)', diff saved to https://phabricator.wikimedia.org/P40552 and previous config saved to /var/cache/conftool/dbconfig/20221122-121633-ladsgroup.json
  • 12:16 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 12:16 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 12:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T322618)', diff saved to https://phabricator.wikimedia.org/P40551 and previous config saved to /var/cache/conftool/dbconfig/20221122-121612-ladsgroup.json
  • 12:14 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/thumbor: sync
  • 12:14 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/thumbor: sync
  • 12:10 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/thumbor: sync
  • 12:10 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/thumbor: sync
  • 12:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P40550 and previous config saved to /var/cache/conftool/dbconfig/20221122-120519-ladsgroup.json
  • 12:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1031.eqiad.wmnet
  • 12:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1118', diff saved to https://phabricator.wikimedia.org/P40549 and previous config saved to /var/cache/conftool/dbconfig/20221122-120431-marostegui.json
  • 12:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P40548 and previous config saved to /var/cache/conftool/dbconfig/20221122-120140-marostegui.json
  • 12:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P40547 and previous config saved to /var/cache/conftool/dbconfig/20221122-120106-ladsgroup.json
  • 11:59 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1031.eqiad.wmnet
  • 11:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubetcd1005.eqiad.wmnet with reason: ganeti reboot
  • 11:59 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on kubetcd1005.eqiad.wmnet with reason: ganeti reboot
  • 11:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on aux-k8s-etcd1001.eqiad.wmnet with reason: ganeti reboot
  • 11:58 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on aux-k8s-etcd1001.eqiad.wmnet with reason: ganeti reboot
  • 11:53 effie: MAPS maintenance EQIAD: trigger full planet re-import for maps eqiad
  • 11:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P40546 and previous config saved to /var/cache/conftool/dbconfig/20221122-115012-ladsgroup.json
  • 11:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1118 (T321130)', diff saved to https://phabricator.wikimedia.org/P40545 and previous config saved to /var/cache/conftool/dbconfig/20221122-114925-marostegui.json
  • 11:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P40544 and previous config saved to /var/cache/conftool/dbconfig/20221122-114634-marostegui.json
  • 11:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P40543 and previous config saved to /var/cache/conftool/dbconfig/20221122-114559-ladsgroup.json
  • 11:44 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1049.eqiad.wmnet with OS bullseye
  • 11:36 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1118 (T321130)', diff saved to https://phabricator.wikimedia.org/P40542 and previous config saved to /var/cache/conftool/dbconfig/20221122-113602-marostegui.json
  • 11:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1118.eqiad.wmnet with reason: Maintenance
  • 11:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1118.eqiad.wmnet with reason: Maintenance
  • 11:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1107 (T321130)', diff saved to https://phabricator.wikimedia.org/P40541 and previous config saved to /var/cache/conftool/dbconfig/20221122-113541-marostegui.json
  • 11:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T322618)', diff saved to https://phabricator.wikimedia.org/P40540 and previous config saved to /var/cache/conftool/dbconfig/20221122-113506-ladsgroup.json
  • 11:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2150 (T322618)', diff saved to https://phabricator.wikimedia.org/P40539 and previous config saved to /var/cache/conftool/dbconfig/20221122-113249-ladsgroup.json
  • 11:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2150.codfw.wmnet with reason: Maintenance
  • 11:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2150.codfw.wmnet with reason: Maintenance
  • 11:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T322618)', diff saved to https://phabricator.wikimedia.org/P40538 and previous config saved to /var/cache/conftool/dbconfig/20221122-113227-ladsgroup.json
  • 11:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T321126)', diff saved to https://phabricator.wikimedia.org/P40537 and previous config saved to /var/cache/conftool/dbconfig/20221122-113127-marostegui.json
  • 11:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T321312)', diff saved to https://phabricator.wikimedia.org/P40536 and previous config saved to /var/cache/conftool/dbconfig/20221122-113053-ladsgroup.json
  • 11:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T322618)', diff saved to https://phabricator.wikimedia.org/P40535 and previous config saved to /var/cache/conftool/dbconfig/20221122-113053-ladsgroup.json
  • 11:29 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1161 (T321126)', diff saved to https://phabricator.wikimedia.org/P40534 and previous config saved to /var/cache/conftool/dbconfig/20221122-112856-marostegui.json
  • 11:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 11:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 11:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1161.eqiad.wmnet with reason: Maintenance
  • 11:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1158 (T322618)', diff saved to https://phabricator.wikimedia.org/P40533 and previous config saved to /var/cache/conftool/dbconfig/20221122-112843-ladsgroup.json
  • 11:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1161.eqiad.wmnet with reason: Maintenance
  • 11:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 11:28 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 11:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 11:28 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 11:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1030.eqiad.wmnet
  • 11:21 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 11:21 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 11:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T321126)', diff saved to https://phabricator.wikimedia.org/P40532 and previous config saved to /var/cache/conftool/dbconfig/20221122-112137-marostegui.json
  • 11:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136 (T322618)', diff saved to https://phabricator.wikimedia.org/P40531 and previous config saved to /var/cache/conftool/dbconfig/20221122-112131-ladsgroup.json
  • 11:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1107', diff saved to https://phabricator.wikimedia.org/P40530 and previous config saved to /var/cache/conftool/dbconfig/20221122-112034-marostegui.json
  • 11:18 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1049.eqiad.wmnet with reason: host reimage
  • 11:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P40529 and previous config saved to /var/cache/conftool/dbconfig/20221122-111721-ladsgroup.json
  • 11:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1030.eqiad.wmnet
  • 11:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P40528 and previous config saved to /var/cache/conftool/dbconfig/20221122-111547-ladsgroup.json
  • 11:15 aborrero@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1049.eqiad.wmnet with reason: host reimage
  • 11:10 moritzm: installing gnutls28 security updates
  • 11:07 stevemunene@deploy1002: Finished deploy [analytics/turnilo/deploy@51da050]: (no justification provided) (duration: 02m 12s)
  • 11:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P40527 and previous config saved to /var/cache/conftool/dbconfig/20221122-110631-marostegui.json
  • 11:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136', diff saved to https://phabricator.wikimedia.org/P40526 and previous config saved to /var/cache/conftool/dbconfig/20221122-110625-ladsgroup.json
  • 11:05 stevemunene@deploy1002: Started deploy [analytics/turnilo/deploy@51da050]: (no justification provided)
  • 11:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1107', diff saved to https://phabricator.wikimedia.org/P40525 and previous config saved to /var/cache/conftool/dbconfig/20221122-110528-marostegui.json
  • 11:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P40524 and previous config saved to /var/cache/conftool/dbconfig/20221122-110214-ladsgroup.json
  • 11:01 aborrero@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1049.eqiad.wmnet with OS bullseye
  • 11:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P40523 and previous config saved to /var/cache/conftool/dbconfig/20221122-110040-ladsgroup.json
  • 10:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1029.eqiad.wmnet
  • 10:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P40522 and previous config saved to /var/cache/conftool/dbconfig/20221122-105124-marostegui.json
  • 10:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1029.eqiad.wmnet
  • 10:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136', diff saved to https://phabricator.wikimedia.org/P40521 and previous config saved to /var/cache/conftool/dbconfig/20221122-105118-ladsgroup.json
  • 10:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1107 (T321130)', diff saved to https://phabricator.wikimedia.org/P40520 and previous config saved to /var/cache/conftool/dbconfig/20221122-105021-marostegui.json
  • 10:49 jnuche@deploy1002: Finished scap: testing k8s deploys (duration: 21m 06s)
  • 10:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T322618)', diff saved to https://phabricator.wikimedia.org/P40519 and previous config saved to /var/cache/conftool/dbconfig/20221122-104708-ladsgroup.json
  • 10:46 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
  • 10:46 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
  • 10:45 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
  • 10:45 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
  • 10:45 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-jobrunner: apply
  • 10:45 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
  • 10:45 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
  • 10:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T321312)', diff saved to https://phabricator.wikimedia.org/P40518 and previous config saved to /var/cache/conftool/dbconfig/20221122-104534-ladsgroup.json
  • 10:45 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
  • 10:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2122 (T322618)', diff saved to https://phabricator.wikimedia.org/P40517 and previous config saved to /var/cache/conftool/dbconfig/20221122-104451-ladsgroup.json
  • 10:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2122.codfw.wmnet with reason: Maintenance
  • 10:45 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
  • 10:45 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
  • 10:45 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-jobrunner: apply
  • 10:45 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
  • 10:45 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
  • 10:44 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
  • 10:44 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-web: apply
  • 10:44 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
  • 10:44 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
  • 10:44 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
  • 10:44 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
  • 10:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2122.codfw.wmnet with reason: Maintenance
  • 10:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121 (T322618)', diff saved to https://phabricator.wikimedia.org/P40516 and previous config saved to /var/cache/conftool/dbconfig/20221122-104429-ladsgroup.json
  • 10:44 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
  • 10:44 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
  • 10:44 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-jobrunner: apply
  • 10:44 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
  • 10:44 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
  • 10:44 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
  • 10:43 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
  • 10:43 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
  • 10:43 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-jobrunner: apply
  • 10:43 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
  • 10:43 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
  • 10:43 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
  • 10:43 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-web: apply
  • 10:39 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
  • 10:38 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 10:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T321126)', diff saved to https://phabricator.wikimedia.org/P40515 and previous config saved to /var/cache/conftool/dbconfig/20221122-103618-marostegui.json
  • 10:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136 (T322618)', diff saved to https://phabricator.wikimedia.org/P40514 and previous config saved to /var/cache/conftool/dbconfig/20221122-103612-ladsgroup.json
  • 10:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1107 (T321130)', diff saved to https://phabricator.wikimedia.org/P40513 and previous config saved to /var/cache/conftool/dbconfig/20221122-103544-marostegui.json
  • 10:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 10:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1107.eqiad.wmnet with reason: Maintenance
  • 10:35 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 10:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T323214)', diff saved to https://phabricator.wikimedia.org/P40512 and previous config saved to /var/cache/conftool/dbconfig/20221122-103527-ladsgroup.json
  • 10:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1107.eqiad.wmnet with reason: Maintenance
  • 10:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 (T321130)', diff saved to https://phabricator.wikimedia.org/P40511 and previous config saved to /var/cache/conftool/dbconfig/20221122-103522-marostegui.json
  • 10:34 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 10:34 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
  • 10:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1136 (T322618)', diff saved to https://phabricator.wikimedia.org/P40510 and previous config saved to /var/cache/conftool/dbconfig/20221122-103402-ladsgroup.json
  • 10:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1136.eqiad.wmnet with reason: Maintenance
  • 10:33 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3315 (T321126)', diff saved to https://phabricator.wikimedia.org/P40509 and previous config saved to /var/cache/conftool/dbconfig/20221122-103346-marostegui.json
  • 10:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1136.eqiad.wmnet with reason: Maintenance
  • 10:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T322618)', diff saved to https://phabricator.wikimedia.org/P40508 and previous config saved to /var/cache/conftool/dbconfig/20221122-103341-ladsgroup.json
  • 10:33 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1144.eqiad.wmnet with reason: Maintenance
  • 10:33 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1144.eqiad.wmnet with reason: Maintenance
  • 10:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 (T321126)', diff saved to https://phabricator.wikimedia.org/P40507 and previous config saved to /var/cache/conftool/dbconfig/20221122-103336-marostegui.json
  • 10:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P40506 and previous config saved to /var/cache/conftool/dbconfig/20221122-102923-ladsgroup.json
  • 10:28 jnuche@deploy1002: Started scap: testing k8s deploys
  • 10:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P40505 and previous config saved to /var/cache/conftool/dbconfig/20221122-102021-ladsgroup.json
  • 10:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P40504 and previous config saved to /var/cache/conftool/dbconfig/20221122-102016-marostegui.json
  • 10:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1026.eqiad.wmnet
  • 10:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P40503 and previous config saved to /var/cache/conftool/dbconfig/20221122-101834-ladsgroup.json
  • 10:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P40502 and previous config saved to /var/cache/conftool/dbconfig/20221122-101829-marostegui.json
  • 10:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1167 (T321312)', diff saved to https://phabricator.wikimedia.org/P40501 and previous config saved to /var/cache/conftool/dbconfig/20221122-101620-ladsgroup.json
  • 10:16 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 10:16 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 10:16 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1167.eqiad.wmnet with reason: Maintenance
  • 10:15 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1167.eqiad.wmnet with reason: Maintenance
  • 10:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P40500 and previous config saved to /var/cache/conftool/dbconfig/20221122-101417-ladsgroup.json
  • 10:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1026.eqiad.wmnet
  • 10:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on aux-k8s-etcd1002.eqiad.wmnet with reason: ganeti reboot
  • 10:12 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on aux-k8s-etcd1002.eqiad.wmnet with reason: ganeti reboot
  • 10:09 godog: start backfilling data into graphite2004 - T315524
  • 10:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P40499 and previous config saved to /var/cache/conftool/dbconfig/20221122-100515-ladsgroup.json
  • 10:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P40498 and previous config saved to /var/cache/conftool/dbconfig/20221122-100509-marostegui.json
  • 10:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P40497 and previous config saved to /var/cache/conftool/dbconfig/20221122-100328-ladsgroup.json
  • 10:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P40496 and previous config saved to /var/cache/conftool/dbconfig/20221122-100323-marostegui.json
  • 09:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121 (T322618)', diff saved to https://phabricator.wikimedia.org/P40495 and previous config saved to /var/cache/conftool/dbconfig/20221122-095910-ladsgroup.json
  • 09:58 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1050.eqiad.wmnet with OS bullseye
  • 09:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2121 (T322618)', diff saved to https://phabricator.wikimedia.org/P40494 and previous config saved to /var/cache/conftool/dbconfig/20221122-095652-ladsgroup.json
  • 09:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 09:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 09:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120 (T322618)', diff saved to https://phabricator.wikimedia.org/P40493 and previous config saved to /var/cache/conftool/dbconfig/20221122-095631-ladsgroup.json
  • 09:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1017.eqiad.wmnet
  • 09:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T323214)', diff saved to https://phabricator.wikimedia.org/P40492 and previous config saved to /var/cache/conftool/dbconfig/20221122-095008-ladsgroup.json
  • 09:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 (T321130)', diff saved to https://phabricator.wikimedia.org/P40491 and previous config saved to /var/cache/conftool/dbconfig/20221122-095003-marostegui.json
  • 09:49 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/apertium: apply
  • 09:48 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/apertium: apply
  • 09:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T322618)', diff saved to https://phabricator.wikimedia.org/P40490 and previous config saved to /var/cache/conftool/dbconfig/20221122-094821-ladsgroup.json
  • 09:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 (T321126)', diff saved to https://phabricator.wikimedia.org/P40489 and previous config saved to /var/cache/conftool/dbconfig/20221122-094817-marostegui.json
  • 09:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1017.eqiad.wmnet
  • 09:46 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3315 (T321126)', diff saved to https://phabricator.wikimedia.org/P40488 and previous config saved to /var/cache/conftool/dbconfig/20221122-094645-marostegui.json
  • 09:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1113.eqiad.wmnet with reason: Maintenance
  • 09:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1113.eqiad.wmnet with reason: Maintenance
  • 09:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 (T321126)', diff saved to https://phabricator.wikimedia.org/P40487 and previous config saved to /var/cache/conftool/dbconfig/20221122-094635-marostegui.json
  • 09:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1127 (T322618)', diff saved to https://phabricator.wikimedia.org/P40486 and previous config saved to /var/cache/conftool/dbconfig/20221122-094611-ladsgroup.json
  • 09:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
  • 09:45 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
  • 09:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 (T322618)', diff saved to https://phabricator.wikimedia.org/P40485 and previous config saved to /var/cache/conftool/dbconfig/20221122-094550-ladsgroup.json
  • 09:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120', diff saved to https://phabricator.wikimedia.org/P40484 and previous config saved to /var/cache/conftool/dbconfig/20221122-094125-ladsgroup.json
  • 09:36 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on graphite2004.codfw.wmnet with reason: setup
  • 09:36 filippo@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on graphite2004.codfw.wmnet with reason: setup
  • 09:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1106 (T321130)', diff saved to https://phabricator.wikimedia.org/P40483 and previous config saved to /var/cache/conftool/dbconfig/20221122-093556-marostegui.json
  • 09:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 09:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 09:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1106.eqiad.wmnet with reason: Maintenance
  • 09:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1106.eqiad.wmnet with reason: Maintenance
  • 09:31 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1050.eqiad.wmnet with reason: host reimage
  • 09:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P40482 and previous config saved to /var/cache/conftool/dbconfig/20221122-093128-marostegui.json
  • 09:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P40481 and previous config saved to /var/cache/conftool/dbconfig/20221122-093044-ladsgroup.json
  • 09:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T321130)', diff saved to https://phabricator.wikimedia.org/P40480 and previous config saved to /var/cache/conftool/dbconfig/20221122-092845-marostegui.json
  • 09:27 aborrero@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1050.eqiad.wmnet with reason: host reimage
  • 09:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120', diff saved to https://phabricator.wikimedia.org/P40479 and previous config saved to /var/cache/conftool/dbconfig/20221122-092618-ladsgroup.json
  • 09:25 moritzm: failover Ganeti master in eqiad to ganeti1028 T311687
  • 09:24 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/apertium: apply
  • 09:23 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/apertium: apply
  • 09:22 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/apertium: apply
  • 09:22 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/apertium: apply
  • 09:19 oblivian@deploy1002: helmfile [staging] DONE helmfile.d/services/apertium: apply
  • 09:18 oblivian@deploy1002: helmfile [staging] START helmfile.d/services/apertium: apply
  • 09:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P40478 and previous config saved to /var/cache/conftool/dbconfig/20221122-091621-marostegui.json
  • 09:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P40477 and previous config saved to /var/cache/conftool/dbconfig/20221122-091537-ladsgroup.json
  • 09:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P40476 and previous config saved to /var/cache/conftool/dbconfig/20221122-091339-marostegui.json
  • 09:13 aborrero@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1050.eqiad.wmnet with OS bullseye
  • 09:11 jmm@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin2002.codfw.wmnet
  • 09:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120 (T322618)', diff saved to https://phabricator.wikimedia.org/P40475 and previous config saved to /var/cache/conftool/dbconfig/20221122-091112-ladsgroup.json
  • 09:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2120 (T322618)', diff saved to https://phabricator.wikimedia.org/P40474 and previous config saved to /var/cache/conftool/dbconfig/20221122-090854-ladsgroup.json
  • 09:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2120.codfw.wmnet with reason: Maintenance
  • 09:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2120.codfw.wmnet with reason: Maintenance
  • 09:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108 (T322618)', diff saved to https://phabricator.wikimedia.org/P40473 and previous config saved to /var/cache/conftool/dbconfig/20221122-090833-ladsgroup.json
  • 09:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 (T321126)', diff saved to https://phabricator.wikimedia.org/P40472 and previous config saved to /var/cache/conftool/dbconfig/20221122-090115-marostegui.json
  • 09:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 (T322618)', diff saved to https://phabricator.wikimedia.org/P40471 and previous config saved to /var/cache/conftool/dbconfig/20221122-090030-ladsgroup.json
  • 09:00 jmm@cumin1001: START - Cookbook sre.hosts.reboot-single for host cumin2002.codfw.wmnet
  • 08:58 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1110 (T321126)', diff saved to https://phabricator.wikimedia.org/P40470 and previous config saved to /var/cache/conftool/dbconfig/20221122-085843-marostegui.json
  • 08:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P40469 and previous config saved to /var/cache/conftool/dbconfig/20221122-085832-marostegui.json
  • 08:58 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1110.eqiad.wmnet with reason: Maintenance
  • 08:58 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1110.eqiad.wmnet with reason: Maintenance
  • 08:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100 (T321126)', diff saved to https://phabricator.wikimedia.org/P40468 and previous config saved to /var/cache/conftool/dbconfig/20221122-085826-marostegui.json
  • 08:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3317 (T322618)', diff saved to https://phabricator.wikimedia.org/P40467 and previous config saved to /var/cache/conftool/dbconfig/20221122-085820-ladsgroup.json
  • 08:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
  • 08:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
  • 08:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 (T322618)', diff saved to https://phabricator.wikimedia.org/P40466 and previous config saved to /var/cache/conftool/dbconfig/20221122-085758-ladsgroup.json
  • 08:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108', diff saved to https://phabricator.wikimedia.org/P40465 and previous config saved to /var/cache/conftool/dbconfig/20221122-085327-ladsgroup.json
  • 08:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T321130)', diff saved to https://phabricator.wikimedia.org/P40464 and previous config saved to /var/cache/conftool/dbconfig/20221122-084326-marostegui.json
  • 08:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100', diff saved to https://phabricator.wikimedia.org/P40463 and previous config saved to /var/cache/conftool/dbconfig/20221122-084320-marostegui.json
  • 08:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P40462 and previous config saved to /var/cache/conftool/dbconfig/20221122-084252-ladsgroup.json
  • 08:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108', diff saved to https://phabricator.wikimedia.org/P40461 and previous config saved to /var/cache/conftool/dbconfig/20221122-083820-ladsgroup.json
  • 08:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1200 (T323214)', diff saved to https://phabricator.wikimedia.org/P40460 and previous config saved to /var/cache/conftool/dbconfig/20221122-083003-ladsgroup.json
  • 08:29 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1200.eqiad.wmnet with reason: Maintenance
  • 08:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1200.eqiad.wmnet with reason: Maintenance
  • 08:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T323214)', diff saved to https://phabricator.wikimedia.org/P40459 and previous config saved to /var/cache/conftool/dbconfig/20221122-082920-ladsgroup.json
  • 08:29 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3311 (T321130)', diff saved to https://phabricator.wikimedia.org/P40458 and previous config saved to /var/cache/conftool/dbconfig/20221122-082904-marostegui.json
  • 08:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1105.eqiad.wmnet with reason: Maintenance
  • 08:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1105.eqiad.wmnet with reason: Maintenance
  • 08:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 (T321130)', diff saved to https://phabricator.wikimedia.org/P40457 and previous config saved to /var/cache/conftool/dbconfig/20221122-082842-marostegui.json
  • 08:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100', diff saved to https://phabricator.wikimedia.org/P40456 and previous config saved to /var/cache/conftool/dbconfig/20221122-082813-marostegui.json
  • 08:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P40455 and previous config saved to /var/cache/conftool/dbconfig/20221122-082746-ladsgroup.json
  • 08:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108 (T322618)', diff saved to https://phabricator.wikimedia.org/P40454 and previous config saved to /var/cache/conftool/dbconfig/20221122-082314-ladsgroup.json
  • 08:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2108 (T322618)', diff saved to https://phabricator.wikimedia.org/P40453 and previous config saved to /var/cache/conftool/dbconfig/20221122-082057-ladsgroup.json
  • 08:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2108.codfw.wmnet with reason: Maintenance
  • 08:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2108.codfw.wmnet with reason: Maintenance
  • 08:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2100.codfw.wmnet with reason: Maintenance
  • 08:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2100.codfw.wmnet with reason: Maintenance
  • 08:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 08:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 08:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P40452 and previous config saved to /var/cache/conftool/dbconfig/20221122-081413-ladsgroup.json
  • 08:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P40451 and previous config saved to /var/cache/conftool/dbconfig/20221122-081336-marostegui.json
  • 08:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100 (T321126)', diff saved to https://phabricator.wikimedia.org/P40450 and previous config saved to /var/cache/conftool/dbconfig/20221122-081307-marostegui.json
  • 08:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 (T322618)', diff saved to https://phabricator.wikimedia.org/P40449 and previous config saved to /var/cache/conftool/dbconfig/20221122-081239-ladsgroup.json
  • 08:10 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1100 (T321126)', diff saved to https://phabricator.wikimedia.org/P40448 and previous config saved to /var/cache/conftool/dbconfig/20221122-081035-marostegui.json
  • 08:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3317 (T322618)', diff saved to https://phabricator.wikimedia.org/P40447 and previous config saved to /var/cache/conftool/dbconfig/20221122-081029-ladsgroup.json
  • 08:10 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1100.eqiad.wmnet with reason: Maintenance
  • 08:10 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1100.eqiad.wmnet with reason: Maintenance
  • 08:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 (T321126)', diff saved to https://phabricator.wikimedia.org/P40446 and previous config saved to /var/cache/conftool/dbconfig/20221122-081024-marostegui.json
  • 08:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
  • 08:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
  • 08:09 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/blubberoid: apply
  • 08:08 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/blubberoid: apply
  • 08:00 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/blubberoid: apply
  • 08:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T323214)', diff saved to https://phabricator.wikimedia.org/P40445 and previous config saved to /var/cache/conftool/dbconfig/20221122-080002-ladsgroup.json
  • 07:59 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/blubberoid: apply
  • 07:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P40444 and previous config saved to /var/cache/conftool/dbconfig/20221122-075907-ladsgroup.json
  • 07:58 oblivian@deploy1002: helmfile [staging] DONE helmfile.d/services/blubberoid: apply
  • 07:58 oblivian@deploy1002: helmfile [staging] START helmfile.d/services/blubberoid: apply
  • 07:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P40443 and previous config saved to /var/cache/conftool/dbconfig/20221122-075829-marostegui.json
  • 07:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P40442 and previous config saved to /var/cache/conftool/dbconfig/20221122-075518-marostegui.json
  • 07:54 oblivian@deploy1002: helmfile [staging] DONE helmfile.d/services/blubberoid: apply
  • 07:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P40441 and previous config saved to /var/cache/conftool/dbconfig/20221122-074455-ladsgroup.json
  • 07:44 oblivian@deploy1002: helmfile [staging] START helmfile.d/services/blubberoid: apply
  • 07:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T323214)', diff saved to https://phabricator.wikimedia.org/P40440 and previous config saved to /var/cache/conftool/dbconfig/20221122-074400-ladsgroup.json
  • 07:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 (T321130)', diff saved to https://phabricator.wikimedia.org/P40439 and previous config saved to /var/cache/conftool/dbconfig/20221122-074323-marostegui.json
  • 07:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on ml-etcd1002.eqiad.wmnet with reason: rack move of ganeti1012
  • 07:41 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on ml-etcd1002.eqiad.wmnet with reason: rack move of ganeti1012
  • 07:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on kubetcd1004.eqiad.wmnet with reason: rack move of ganeti1012
  • 07:40 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on kubetcd1004.eqiad.wmnet with reason: rack move of ganeti1012
  • 07:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dse-k8s-etcd1003.eqiad.wmnet with reason: rack move of ganeti1012
  • 07:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P40438 and previous config saved to /var/cache/conftool/dbconfig/20221122-074011-marostegui.json
  • 07:40 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dse-k8s-etcd1003.eqiad.wmnet with reason: rack move of ganeti1012
  • 07:40 oblivian@deploy1002: helmfile [staging] DONE helmfile.d/services/blubberoid: apply
  • 07:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1122.eqiad.wmnet with reason: Maintenance
  • 07:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1122.eqiad.wmnet with reason: Maintenance
  • 07:33 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1122.eqiad.wmnet with reason: Maintenance
  • 07:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1122.eqiad.wmnet with reason: Maintenance
  • 07:30 oblivian@deploy1002: helmfile [staging] START helmfile.d/services/blubberoid: apply
  • 07:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P40437 and previous config saved to /var/cache/conftool/dbconfig/20221122-072949-ladsgroup.json
  • 07:29 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3311 (T321130)', diff saved to https://phabricator.wikimedia.org/P40436 and previous config saved to /var/cache/conftool/dbconfig/20221122-072918-marostegui.json
  • 07:29 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1099.eqiad.wmnet with reason: Maintenance
  • 07:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1099.eqiad.wmnet with reason: Maintenance
  • 07:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depool db1122 T323116', diff saved to https://phabricator.wikimedia.org/P40435 and previous config saved to /var/cache/conftool/dbconfig/20221122-072802-ladsgroup.json
  • 07:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 (T321126)', diff saved to https://phabricator.wikimedia.org/P40434 and previous config saved to /var/cache/conftool/dbconfig/20221122-072505-marostegui.json
  • 07:22 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3315 (T321126)', diff saved to https://phabricator.wikimedia.org/P40433 and previous config saved to /var/cache/conftool/dbconfig/20221122-072233-marostegui.json
  • 07:22 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1096.eqiad.wmnet with reason: Maintenance
  • 07:22 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on db1096.eqiad.wmnet with reason: Maintenance
  • 07:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Promote db1162 to s2 primary and set section read-write T323116', diff saved to https://phabricator.wikimedia.org/P40432 and previous config saved to /var/cache/conftool/dbconfig/20221122-071759-ladsgroup.json
  • 07:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Set s2 eqiad as read-only for maintenance - T323116', diff saved to https://phabricator.wikimedia.org/P40431 and previous config saved to /var/cache/conftool/dbconfig/20221122-071727-ladsgroup.json
  • 07:17 Amir1: Starting s2 eqiad failover from db1122 to db1162 - T323116
  • 07:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T323214)', diff saved to https://phabricator.wikimedia.org/P40430 and previous config saved to /var/cache/conftool/dbconfig/20221122-071442-ladsgroup.json
  • 06:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Set db1162 with weight 0 T323116', diff saved to https://phabricator.wikimedia.org/P40429 and previous config saved to /var/cache/conftool/dbconfig/20221122-065219-ladsgroup.json
  • 06:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 27 hosts with reason: Primary switchover s2 T323116
  • 06:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 27 hosts with reason: Primary switchover s2 T323116
  • 06:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1185 (T323214)', diff saved to https://phabricator.wikimedia.org/P40428 and previous config saved to /var/cache/conftool/dbconfig/20221122-064856-ladsgroup.json
  • 06:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1185.eqiad.wmnet with reason: Maintenance
  • 06:48 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1185.eqiad.wmnet with reason: Maintenance
  • 06:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T323214)', diff saved to https://phabricator.wikimedia.org/P40427 and previous config saved to /var/cache/conftool/dbconfig/20221122-064834-ladsgroup.json
  • 06:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P40426 and previous config saved to /var/cache/conftool/dbconfig/20221122-063328-ladsgroup.json
  • 06:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P40425 and previous config saved to /var/cache/conftool/dbconfig/20221122-061821-ladsgroup.json
  • 06:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T323214)', diff saved to https://phabricator.wikimedia.org/P40424 and previous config saved to /var/cache/conftool/dbconfig/20221122-060315-ladsgroup.json
  • 05:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2178 (T323214)', diff saved to https://phabricator.wikimedia.org/P40423 and previous config saved to /var/cache/conftool/dbconfig/20221122-053947-ladsgroup.json
  • 05:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2178.codfw.wmnet with reason: Maintenance
  • 05:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2178.codfw.wmnet with reason: Maintenance
  • 05:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315 (T323214)', diff saved to https://phabricator.wikimedia.org/P40422 and previous config saved to /var/cache/conftool/dbconfig/20221122-053925-ladsgroup.json
  • 05:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315', diff saved to https://phabricator.wikimedia.org/P40421 and previous config saved to /var/cache/conftool/dbconfig/20221122-052419-ladsgroup.json
  • 05:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315', diff saved to https://phabricator.wikimedia.org/P40420 and previous config saved to /var/cache/conftool/dbconfig/20221122-050912-ladsgroup.json
  • 04:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315 (T323214)', diff saved to https://phabricator.wikimedia.org/P40419 and previous config saved to /var/cache/conftool/dbconfig/20221122-045406-ladsgroup.json
  • 04:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1161 (T323214)', diff saved to https://phabricator.wikimedia.org/P40418 and previous config saved to /var/cache/conftool/dbconfig/20221122-040429-ladsgroup.json
  • 04:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 04:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 04:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1161.eqiad.wmnet with reason: Maintenance
  • 04:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1161.eqiad.wmnet with reason: Maintenance
  • 02:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2171:3315 (T323214)', diff saved to https://phabricator.wikimedia.org/P40417 and previous config saved to /var/cache/conftool/dbconfig/20221122-025209-ladsgroup.json
  • 02:52 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2171.codfw.wmnet with reason: Maintenance
  • 02:51 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2171.codfw.wmnet with reason: Maintenance
  • 02:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T323214)', diff saved to https://phabricator.wikimedia.org/P40416 and previous config saved to /var/cache/conftool/dbconfig/20221122-025148-ladsgroup.json
  • 02:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P40415 and previous config saved to /var/cache/conftool/dbconfig/20221122-023641-ladsgroup.json
  • 02:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P40414 and previous config saved to /var/cache/conftool/dbconfig/20221122-022134-ladsgroup.json
  • 02:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T323214)', diff saved to https://phabricator.wikimedia.org/P40413 and previous config saved to /var/cache/conftool/dbconfig/20221122-020628-ladsgroup.json
  • 01:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 01:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 01:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T323214)', diff saved to https://phabricator.wikimedia.org/P40412 and previous config saved to /var/cache/conftool/dbconfig/20221122-015923-ladsgroup.json
  • 01:56 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on phab1004.eqiad.wmnet with reason: T322250
  • 01:56 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 14 days, 0:00:00 on phab1004.eqiad.wmnet with reason: T322250
  • 01:55 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for phab1001.eqiad.wmnet
  • 01:55 dzahn@cumin2002: START - Cookbook sre.hosts.remove-downtime for phab1001.eqiad.wmnet
  • 01:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20221122-014417-ladsgroup.json
  • 01:29 brennen@deploy1002: Finished deploy [phabricator/deployment@f68dc24]: deploy config changes for phab1004 -> phab1001 revert (duration: 00m 56s)
  • 01:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20221122-012910-ladsgroup.json
  • 01:28 brennen@deploy1002: Started deploy [phabricator/deployment@f68dc24]: deploy config changes for phab1004 -> phab1001 revert
  • 01:26 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on phab1004.eqiad.wmnet with reason: T322250
  • 01:26 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on phab1004.eqiad.wmnet with reason: T322250
  • 01:25 brennen: reverting to phab1001; short phabricator downtime incoming while DNS changes are made (T280597)
  • 01:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T323214)', diff saved to https://phabricator.wikimedia.org/P40411 and previous config saved to /var/cache/conftool/dbconfig/20221122-011404-ladsgroup.json
  • 01:13 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on phab1004.eqiad.wmnet with reason: T322250
  • 01:13 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on phab1004.eqiad.wmnet with reason: T322250
  • 00:31 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on phab1001.eqiad.wmnet with reason: T322250
  • 00:31 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 14 days, 0:00:00 on phab1001.eqiad.wmnet with reason: T322250
  • 00:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 00:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 00:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T322618)', diff saved to https://phabricator.wikimedia.org/P40410 and previous config saved to /var/cache/conftool/dbconfig/20221122-002411-ladsgroup.json
  • 00:23 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 00:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 00:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T322618)', diff saved to https://phabricator.wikimedia.org/P40409 and previous config saved to /var/cache/conftool/dbconfig/20221122-002245-ladsgroup.json
  • 00:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P40408 and previous config saved to /var/cache/conftool/dbconfig/20221122-000904-ladsgroup.json
  • 00:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P40407 and previous config saved to /var/cache/conftool/dbconfig/20221122-000739-ladsgroup.json
  • 00:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2157 (T323214)', diff saved to https://phabricator.wikimedia.org/P40406 and previous config saved to /var/cache/conftool/dbconfig/20221122-000700-ladsgroup.json
  • 00:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2157.codfw.wmnet with reason: Maintenance
  • 00:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2157.codfw.wmnet with reason: Maintenance
  • 00:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315 (T323214)', diff saved to https://phabricator.wikimedia.org/P40405 and previous config saved to /var/cache/conftool/dbconfig/20221122-000638-ladsgroup.json

2022-11-21

  • 23:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P40404 and previous config saved to /var/cache/conftool/dbconfig/20221121-235357-ladsgroup.json
  • 23:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P40403 and previous config saved to /var/cache/conftool/dbconfig/20221121-235232-ladsgroup.json
  • 23:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315', diff saved to https://phabricator.wikimedia.org/P40402 and previous config saved to /var/cache/conftool/dbconfig/20221121-235132-ladsgroup.json
  • 23:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T322618)', diff saved to https://phabricator.wikimedia.org/P40401 and previous config saved to /var/cache/conftool/dbconfig/20221121-233851-ladsgroup.json
  • 23:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T322618)', diff saved to https://phabricator.wikimedia.org/P40400 and previous config saved to /var/cache/conftool/dbconfig/20221121-233726-ladsgroup.json
  • 23:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1197 (T322618)', diff saved to https://phabricator.wikimedia.org/P40399 and previous config saved to /var/cache/conftool/dbconfig/20221121-233640-ladsgroup.json
  • 23:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1197.eqiad.wmnet with reason: Maintenance
  • 23:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315', diff saved to https://phabricator.wikimedia.org/P40398 and previous config saved to /var/cache/conftool/dbconfig/20221121-233625-ladsgroup.json
  • 23:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1197.eqiad.wmnet with reason: Maintenance
  • 23:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T322618)', diff saved to https://phabricator.wikimedia.org/P40397 and previous config saved to /var/cache/conftool/dbconfig/20221121-233619-ladsgroup.json
  • 23:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1198 (T322618)', diff saved to https://phabricator.wikimedia.org/P40396 and previous config saved to /var/cache/conftool/dbconfig/20221121-233331-ladsgroup.json
  • 23:33 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1198.eqiad.wmnet with reason: Maintenance
  • 23:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1198.eqiad.wmnet with reason: Maintenance
  • 23:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T322618)', diff saved to https://phabricator.wikimedia.org/P40395 and previous config saved to /var/cache/conftool/dbconfig/20221121-233309-ladsgroup.json
  • 23:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315 (T323214)', diff saved to https://phabricator.wikimedia.org/P40394 and previous config saved to /var/cache/conftool/dbconfig/20221121-232119-ladsgroup.json
  • 23:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P40393 and previous config saved to /var/cache/conftool/dbconfig/20221121-232112-ladsgroup.json
  • 23:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P40392 and previous config saved to /var/cache/conftool/dbconfig/20221121-231803-ladsgroup.json
  • 23:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3315 (T323214)', diff saved to https://phabricator.wikimedia.org/P40391 and previous config saved to /var/cache/conftool/dbconfig/20221121-230659-ladsgroup.json
  • 23:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1144.eqiad.wmnet with reason: Maintenance
  • 23:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1144.eqiad.wmnet with reason: Maintenance
  • 23:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 (T323214)', diff saved to https://phabricator.wikimedia.org/P40390 and previous config saved to /var/cache/conftool/dbconfig/20221121-230638-ladsgroup.json
  • 23:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P40389 and previous config saved to /var/cache/conftool/dbconfig/20221121-230606-ladsgroup.json
  • 23:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P40388 and previous config saved to /var/cache/conftool/dbconfig/20221121-230256-ladsgroup.json
  • 23:02 bking@cumin1001: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - bking@cumin1001 - T319020
  • 22:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T322618)', diff saved to https://phabricator.wikimedia.org/P40387 and previous config saved to /var/cache/conftool/dbconfig/20221121-225724-ladsgroup.json
  • 22:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P40386 and previous config saved to /var/cache/conftool/dbconfig/20221121-225131-ladsgroup.json
  • 22:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T322618)', diff saved to https://phabricator.wikimedia.org/P40385 and previous config saved to /var/cache/conftool/dbconfig/20221121-225059-ladsgroup.json
  • 22:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T322618)', diff saved to https://phabricator.wikimedia.org/P40384 and previous config saved to /var/cache/conftool/dbconfig/20221121-224749-ladsgroup.json
  • 22:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1188 (T322618)', diff saved to https://phabricator.wikimedia.org/P40383 and previous config saved to /var/cache/conftool/dbconfig/20221121-224648-ladsgroup.json
  • 22:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1188.eqiad.wmnet with reason: Maintenance
  • 22:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1188.eqiad.wmnet with reason: Maintenance
  • 22:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T322618)', diff saved to https://phabricator.wikimedia.org/P40382 and previous config saved to /var/cache/conftool/dbconfig/20221121-224627-ladsgroup.json
  • 22:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1189 (T322618)', diff saved to https://phabricator.wikimedia.org/P40381 and previous config saved to /var/cache/conftool/dbconfig/20221121-224355-ladsgroup.json
  • 22:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1189.eqiad.wmnet with reason: Maintenance
  • 22:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1189.eqiad.wmnet with reason: Maintenance
  • 22:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 (T322618)', diff saved to https://phabricator.wikimedia.org/P40380 and previous config saved to /var/cache/conftool/dbconfig/20221121-224322-ladsgroup.json
  • 22:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P40379 and previous config saved to /var/cache/conftool/dbconfig/20221121-224218-ladsgroup.json
  • 22:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T322618)', diff saved to https://phabricator.wikimedia.org/P40378 and previous config saved to /var/cache/conftool/dbconfig/20221121-224146-ladsgroup.json
  • 22:39 brennen@deploy1002: Finished deploy [phabricator/deployment@f68dc24]: deploy config changes for phab1004 switch (duration: 00m 57s)
  • 22:38 brennen@deploy1002: Started deploy [phabricator/deployment@f68dc24]: deploy config changes for phab1004 switch
  • 22:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20221121-223625-ladsgroup.json
  • 22:33 bking@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - bking@cumin1001 - T319020
  • 22:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20221121-223121-ladsgroup.json
  • 22:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20221121-222816-ladsgroup.json
  • 22:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20221121-222711-ladsgroup.json
  • 22:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20221121-222640-ladsgroup.json
  • 22:23 mutante: stopping apache on phabricator machine - maintenance
  • 22:21 brennen: downtiming and disabling phab1001 in preparation for migration to phab1004 (T280597)
  • 22:21 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on phab1001.eqiad.wmnet with reason: T280597
  • 22:21 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on phab1001.eqiad.wmnet with reason: T280597
  • 22:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 (T323214)', diff saved to https://phabricator.wikimedia.org/P40377 and previous config saved to /var/cache/conftool/dbconfig/20221121-222118-ladsgroup.json
  • 22:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P40376 and previous config saved to /var/cache/conftool/dbconfig/20221121-221614-ladsgroup.json
  • 22:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P40375 and previous config saved to /var/cache/conftool/dbconfig/20221121-221310-ladsgroup.json
  • 22:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T322618)', diff saved to https://phabricator.wikimedia.org/P40374 and previous config saved to /var/cache/conftool/dbconfig/20221121-221205-ladsgroup.json
  • 22:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P40373 and previous config saved to /var/cache/conftool/dbconfig/20221121-221134-ladsgroup.json
  • 22:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2177 (T322618)', diff saved to https://phabricator.wikimedia.org/P40372 and previous config saved to /var/cache/conftool/dbconfig/20221121-220415-ladsgroup.json
  • 22:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2177.codfw.wmnet with reason: Maintenance
  • 22:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2177.codfw.wmnet with reason: Maintenance
  • 22:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T322618)', diff saved to https://phabricator.wikimedia.org/P40371 and previous config saved to /var/cache/conftool/dbconfig/20221121-220343-ladsgroup.json
  • 22:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T322618)', diff saved to https://phabricator.wikimedia.org/P40370 and previous config saved to /var/cache/conftool/dbconfig/20221121-220107-ladsgroup.json
  • 21:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1182 (T322618)', diff saved to https://phabricator.wikimedia.org/P40369 and previous config saved to /var/cache/conftool/dbconfig/20221121-215857-ladsgroup.json
  • 21:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
  • 21:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
  • 21:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T322618)', diff saved to https://phabricator.wikimedia.org/P40368 and previous config saved to /var/cache/conftool/dbconfig/20221121-215835-ladsgroup.json
  • 21:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 (T322618)', diff saved to https://phabricator.wikimedia.org/P40367 and previous config saved to /var/cache/conftool/dbconfig/20221121-215803-ladsgroup.json
  • 21:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T322618)', diff saved to https://phabricator.wikimedia.org/P40366 and previous config saved to /var/cache/conftool/dbconfig/20221121-215627-ladsgroup.json
  • 21:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1179 (T322618)', diff saved to https://phabricator.wikimedia.org/P40365 and previous config saved to /var/cache/conftool/dbconfig/20221121-215409-ladsgroup.json
  • 21:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2175 (T322618)', diff saved to https://phabricator.wikimedia.org/P40364 and previous config saved to /var/cache/conftool/dbconfig/20221121-215409-ladsgroup.json
  • 21:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance
  • 21:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2175.codfw.wmnet with reason: Maintenance
  • 21:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance
  • 21:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2175.codfw.wmnet with reason: Maintenance
  • 21:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T322618)', diff saved to https://phabricator.wikimedia.org/P40363 and previous config saved to /var/cache/conftool/dbconfig/20221121-215348-ladsgroup.json
  • 21:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312 (T322618)', diff saved to https://phabricator.wikimedia.org/P40362 and previous config saved to /var/cache/conftool/dbconfig/20221121-215347-ladsgroup.json
  • 21:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P40361 and previous config saved to /var/cache/conftool/dbconfig/20221121-214836-ladsgroup.json
  • 21:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P40360 and previous config saved to /var/cache/conftool/dbconfig/20221121-214329-ladsgroup.json
  • 21:42 TheresNoTime: close UTC late backport window
  • 21:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P40359 and previous config saved to /var/cache/conftool/dbconfig/20221121-213841-ladsgroup.json
  • 21:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312', diff saved to https://phabricator.wikimedia.org/P40358 and previous config saved to /var/cache/conftool/dbconfig/20221121-213841-ladsgroup.json
  • 21:37 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dbprov1004.mgmt.eqiad.wmnet with reboot policy FORCED
  • 21:35 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host dbprov1004.mgmt.eqiad.wmnet with reboot policy FORCED
  • 21:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P40357 and previous config saved to /var/cache/conftool/dbconfig/20221121-213330-ladsgroup.json
  • 21:31 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dbprov1004.mgmt.eqiad.wmnet with reboot policy FORCED
  • 21:31 samtar@deploy1002: Finished scap: Backport for Fix typo in tests/LoggingTest.php (duration: 04m 33s)
  • 21:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P40356 and previous config saved to /var/cache/conftool/dbconfig/20221121-212822-ladsgroup.json
  • 21:27 samtar@deploy1002: samtar and stang: Backport for Fix typo in tests/LoggingTest.php synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet
  • 21:26 samtar@deploy1002: Started scap: Backport for Fix typo in tests/LoggingTest.php
  • 21:25 samtar@deploy1002: Finished scap: Backport for Fix no-JS Special:Notifications only displaying one notification per day (T323491) (duration: 05m 45s)
  • 21:24 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host dbprov1004.mgmt.eqiad.wmnet with reboot policy FORCED
  • 21:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P40355 and previous config saved to /var/cache/conftool/dbconfig/20221121-212335-ladsgroup.json
  • 21:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312', diff saved to https://phabricator.wikimedia.org/P40354 and previous config saved to /var/cache/conftool/dbconfig/20221121-212334-ladsgroup.json
  • 21:21 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@00e5387]: incoming_links: Rename wiki to wikiid (duration: 02m 12s)
  • 21:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2137:3315 (T323214)', diff saved to https://phabricator.wikimedia.org/P40353 and previous config saved to /var/cache/conftool/dbconfig/20221121-212055-ladsgroup.json
  • 21:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2137.codfw.wmnet with reason: Maintenance
  • 21:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2137.codfw.wmnet with reason: Maintenance
  • 21:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2128 (T323214)', diff saved to https://phabricator.wikimedia.org/P40352 and previous config saved to /var/cache/conftool/dbconfig/20221121-212033-ladsgroup.json
  • 21:19 samtar@deploy1002: samtar and matmarex: Backport for Fix no-JS Special:Notifications only displaying one notification per day (T323491) synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet
  • 21:19 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@00e5387]: incoming_links: Rename wiki to wikiid
  • 21:19 samtar@deploy1002: Started scap: Backport for Fix no-JS Special:Notifications only displaying one notification per day (T323491)
  • 21:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T322618)', diff saved to https://phabricator.wikimedia.org/P40351 and previous config saved to /var/cache/conftool/dbconfig/20221121-211823-ladsgroup.json
  • 21:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T322618)', diff saved to https://phabricator.wikimedia.org/P40350 and previous config saved to /var/cache/conftool/dbconfig/20221121-211316-ladsgroup.json
  • 21:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3312 (T322618)', diff saved to https://phabricator.wikimedia.org/P40349 and previous config saved to /var/cache/conftool/dbconfig/20221121-211105-ladsgroup.json
  • 21:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 21:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 21:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T322618)', diff saved to https://phabricator.wikimedia.org/P40348 and previous config saved to /var/cache/conftool/dbconfig/20221121-211033-ladsgroup.json
  • 21:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2094.codfw.wmnet with reason: Maintenance
  • 21:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2094.codfw.wmnet with reason: Maintenance
  • 21:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2156.codfw.wmnet with reason: Maintenance
  • 21:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2156.codfw.wmnet with reason: Maintenance
  • 21:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T322618)', diff saved to https://phabricator.wikimedia.org/P40347 and previous config saved to /var/cache/conftool/dbconfig/20221121-211008-ladsgroup.json
  • 21:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T322618)', diff saved to https://phabricator.wikimedia.org/P40346 and previous config saved to /var/cache/conftool/dbconfig/20221121-210828-ladsgroup.json
  • 21:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312 (T322618)', diff saved to https://phabricator.wikimedia.org/P40345 and previous config saved to /var/cache/conftool/dbconfig/20221121-210828-ladsgroup.json
  • 21:08 samtar@deploy1002: Finished scap: Backport for Deploy Research Incentive survey on swwiki (T321252) (duration: 05m 32s)
  • 21:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2170:3312 (T322618)', diff saved to https://phabricator.wikimedia.org/P40344 and previous config saved to /var/cache/conftool/dbconfig/20221121-210609-ladsgroup.json
  • 21:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2170.codfw.wmnet with reason: Maintenance
  • 21:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2170.codfw.wmnet with reason: Maintenance
  • 21:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T322618)', diff saved to https://phabricator.wikimedia.org/P40343 and previous config saved to /var/cache/conftool/dbconfig/20221121-210547-ladsgroup.json
  • 21:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2128', diff saved to https://phabricator.wikimedia.org/P40342 and previous config saved to /var/cache/conftool/dbconfig/20221121-210527-ladsgroup.json
  • 21:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1175 (T322618)', diff saved to https://phabricator.wikimedia.org/P40341 and previous config saved to /var/cache/conftool/dbconfig/20221121-210434-ladsgroup.json
  • 21:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
  • 21:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
  • 21:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T322618)', diff saved to https://phabricator.wikimedia.org/P40340 and previous config saved to /var/cache/conftool/dbconfig/20221121-210402-ladsgroup.json
  • 21:03 samtar@deploy1002: samtar and dani: Backport for Deploy Research Incentive survey on swwiki (T321252) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet
  • 21:02 samtar@deploy1002: Started scap: Backport for Deploy Research Incentive survey on swwiki (T321252)
  • 20:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P40339 and previous config saved to /var/cache/conftool/dbconfig/20221121-205526-ladsgroup.json
  • 20:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P40338 and previous config saved to /var/cache/conftool/dbconfig/20221121-205502-ladsgroup.json
  • 20:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P40337 and previous config saved to /var/cache/conftool/dbconfig/20221121-205041-ladsgroup.json
  • 20:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2128', diff saved to https://phabricator.wikimedia.org/P40336 and previous config saved to /var/cache/conftool/dbconfig/20221121-205019-ladsgroup.json
  • 20:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P40335 and previous config saved to /var/cache/conftool/dbconfig/20221121-204855-ladsgroup.json
  • 20:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P40334 and previous config saved to /var/cache/conftool/dbconfig/20221121-204020-ladsgroup.json
  • 20:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P40333 and previous config saved to /var/cache/conftool/dbconfig/20221121-203956-ladsgroup.json
  • 20:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P40332 and previous config saved to /var/cache/conftool/dbconfig/20221121-203534-ladsgroup.json
  • 20:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2128 (T323214)', diff saved to https://phabricator.wikimedia.org/P40331 and previous config saved to /var/cache/conftool/dbconfig/20221121-203513-ladsgroup.json
  • 20:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P40330 and previous config saved to /var/cache/conftool/dbconfig/20221121-203349-ladsgroup.json
  • 20:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T322618)', diff saved to https://phabricator.wikimedia.org/P40329 and previous config saved to /var/cache/conftool/dbconfig/20221121-202513-ladsgroup.json
  • 20:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T322618)', diff saved to https://phabricator.wikimedia.org/P40328 and previous config saved to /var/cache/conftool/dbconfig/20221121-202449-ladsgroup.json
  • 20:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1162 (T322618)', diff saved to https://phabricator.wikimedia.org/P40327 and previous config saved to /var/cache/conftool/dbconfig/20221121-202303-ladsgroup.json
  • 20:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1162.eqiad.wmnet with reason: Maintenance
  • 20:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1162.eqiad.wmnet with reason: Maintenance
  • 20:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T322618)', diff saved to https://phabricator.wikimedia.org/P40326 and previous config saved to /var/cache/conftool/dbconfig/20221121-202242-ladsgroup.json
  • 20:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T322618)', diff saved to https://phabricator.wikimedia.org/P40325 and previous config saved to /var/cache/conftool/dbconfig/20221121-202027-ladsgroup.json
  • 20:19 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@48c230a]: transfer_to_es: Allow first run of wait_for_incoming_links (duration: 02m 14s)
  • 20:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T322618)', diff saved to https://phabricator.wikimedia.org/P40324 and previous config saved to /var/cache/conftool/dbconfig/20221121-201842-ladsgroup.json
  • 20:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2148 (T322618)', diff saved to https://phabricator.wikimedia.org/P40323 and previous config saved to /var/cache/conftool/dbconfig/20221121-201809-ladsgroup.json
  • 20:18 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2148.codfw.wmnet with reason: Maintenance
  • 20:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2148.codfw.wmnet with reason: Maintenance
  • 20:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312 (T322618)', diff saved to https://phabricator.wikimedia.org/P40322 and previous config saved to /var/cache/conftool/dbconfig/20221121-201747-ladsgroup.json
  • 20:17 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@48c230a]: transfer_to_es: Allow first run of wait_for_incoming_links
  • 20:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2149 (T322618)', diff saved to https://phabricator.wikimedia.org/P40321 and previous config saved to /var/cache/conftool/dbconfig/20221121-201648-ladsgroup.json
  • 20:16 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2149.codfw.wmnet with reason: Maintenance
  • 20:16 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2149.codfw.wmnet with reason: Maintenance
  • 20:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3315 (T323214)', diff saved to https://phabricator.wikimedia.org/P40320 and previous config saved to /var/cache/conftool/dbconfig/20221121-201359-ladsgroup.json
  • 20:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1113.eqiad.wmnet with reason: Maintenance
  • 20:13 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1113.eqiad.wmnet with reason: Maintenance
  • 20:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 (T323214)', diff saved to https://phabricator.wikimedia.org/P40319 and previous config saved to /var/cache/conftool/dbconfig/20221121-201338-ladsgroup.json
  • 20:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 20:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 20:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109 (T322618)', diff saved to https://phabricator.wikimedia.org/P40318 and previous config saved to /var/cache/conftool/dbconfig/20221121-201006-ladsgroup.json
  • 20:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P40317 and previous config saved to /var/cache/conftool/dbconfig/20221121-200735-ladsgroup.json
  • 20:06 brett@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5031.eqsin.wmnet with OS buster
  • 20:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312', diff saved to https://phabricator.wikimedia.org/P40316 and previous config saved to /var/cache/conftool/dbconfig/20221121-200238-ladsgroup.json
  • 19:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P40315 and previous config saved to /var/cache/conftool/dbconfig/20221121-195831-ladsgroup.json
  • 19:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P40314 and previous config saved to /var/cache/conftool/dbconfig/20221121-195459-ladsgroup.json
  • 19:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1166 (T322618)', diff saved to https://phabricator.wikimedia.org/P40313 and previous config saved to /var/cache/conftool/dbconfig/20221121-195244-ladsgroup.json
  • 19:52 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 19:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P40312 and previous config saved to /var/cache/conftool/dbconfig/20221121-195229-ladsgroup.json
  • 19:52 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 19:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T322618)', diff saved to https://phabricator.wikimedia.org/P40311 and previous config saved to /var/cache/conftool/dbconfig/20221121-195223-ladsgroup.json
  • 19:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312', diff saved to https://phabricator.wikimedia.org/P40310 and previous config saved to /var/cache/conftool/dbconfig/20221121-194731-ladsgroup.json
  • 19:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P40309 and previous config saved to /var/cache/conftool/dbconfig/20221121-194324-ladsgroup.json
  • 19:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P40308 and previous config saved to /var/cache/conftool/dbconfig/20221121-193953-ladsgroup.json
  • 19:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T322618)', diff saved to https://phabricator.wikimedia.org/P40307 and previous config saved to /var/cache/conftool/dbconfig/20221121-193722-ladsgroup.json
  • 19:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P40306 and previous config saved to /var/cache/conftool/dbconfig/20221121-193717-ladsgroup.json
  • 19:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1156 (T322618)', diff saved to https://phabricator.wikimedia.org/P40305 and previous config saved to /var/cache/conftool/dbconfig/20221121-193512-ladsgroup.json
  • 19:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 19:34 brett@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5031.eqsin.wmnet with reason: host reimage
  • 19:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 19:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
  • 19:34 bking@cumin1001: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: apply config changes - bking@cumin1001 - T319020
  • 19:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
  • 19:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312 (T322618)', diff saved to https://phabricator.wikimedia.org/P40304 and previous config saved to /var/cache/conftool/dbconfig/20221121-193225-ladsgroup.json
  • 19:31 brett@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5031.eqsin.wmnet with reason: host reimage
  • 19:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2138:3312 (T322618)', diff saved to https://phabricator.wikimedia.org/P40303 and previous config saved to /var/cache/conftool/dbconfig/20221121-193006-ladsgroup.json
  • 19:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2138.codfw.wmnet with reason: Maintenance
  • 19:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2138.codfw.wmnet with reason: Maintenance
  • 19:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126 (T322618)', diff saved to https://phabricator.wikimedia.org/P40302 and previous config saved to /var/cache/conftool/dbconfig/20221121-192933-ladsgroup.json
  • 19:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 (T323214)', diff saved to https://phabricator.wikimedia.org/P40301 and previous config saved to /var/cache/conftool/dbconfig/20221121-192818-ladsgroup.json
  • 19:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T322618)', diff saved to https://phabricator.wikimedia.org/P40300 and previous config saved to /var/cache/conftool/dbconfig/20221121-192729-ladsgroup.json
  • 19:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109 (T322618)', diff saved to https://phabricator.wikimedia.org/P40299 and previous config saved to /var/cache/conftool/dbconfig/20221121-192446-ladsgroup.json
  • 19:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2128 (T323214)', diff saved to https://phabricator.wikimedia.org/P40298 and previous config saved to /var/cache/conftool/dbconfig/20221121-192246-ladsgroup.json
  • 19:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2094.codfw.wmnet with reason: Maintenance
  • 19:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2094.codfw.wmnet with reason: Maintenance
  • 19:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2128.codfw.wmnet with reason: Maintenance
  • 19:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P40297 and previous config saved to /var/cache/conftool/dbconfig/20221121-192210-ladsgroup.json
  • 19:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2128.codfw.wmnet with reason: Maintenance
  • 19:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2123 (T323214)', diff saved to https://phabricator.wikimedia.org/P40296 and previous config saved to /var/cache/conftool/dbconfig/20221121-192158-ladsgroup.json
  • 19:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2109 (T322618)', diff saved to https://phabricator.wikimedia.org/P40295 and previous config saved to /var/cache/conftool/dbconfig/20221121-191656-ladsgroup.json
  • 19:16 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2109.codfw.wmnet with reason: Maintenance
  • 19:16 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2109.codfw.wmnet with reason: Maintenance
  • 19:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105 (T322618)', diff saved to https://phabricator.wikimedia.org/P40294 and previous config saved to /var/cache/conftool/dbconfig/20221121-191624-ladsgroup.json
  • 19:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P40293 and previous config saved to /var/cache/conftool/dbconfig/20221121-191427-ladsgroup.json
  • 19:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P40292 and previous config saved to /var/cache/conftool/dbconfig/20221121-191223-ladsgroup.json
  • 19:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T322618)', diff saved to https://phabricator.wikimedia.org/P40291 and previous config saved to /var/cache/conftool/dbconfig/20221121-190702-ladsgroup.json
  • 19:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2123', diff saved to https://phabricator.wikimedia.org/P40290 and previous config saved to /var/cache/conftool/dbconfig/20221121-190652-ladsgroup.json
  • 19:04 brett@cumin1001: START - Cookbook sre.hosts.reimage for host cp5031.eqsin.wmnet with OS buster
  • 19:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1157 (T322618)', diff saved to https://phabricator.wikimedia.org/P40289 and previous config saved to /var/cache/conftool/dbconfig/20221121-190306-ladsgroup.json
  • 19:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1157.eqiad.wmnet with reason: Maintenance
  • 19:02 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1157.eqiad.wmnet with reason: Maintenance
  • 19:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105', diff saved to https://phabricator.wikimedia.org/P40288 and previous config saved to /var/cache/conftool/dbconfig/20221121-190117-ladsgroup.json
  • 19:00 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 19:00 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 19:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 (T322618)', diff saved to https://phabricator.wikimedia.org/P40287 and previous config saved to /var/cache/conftool/dbconfig/20221121-190032-ladsgroup.json
  • 18:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P40286 and previous config saved to /var/cache/conftool/dbconfig/20221121-185920-ladsgroup.json
  • 18:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P40285 and previous config saved to /var/cache/conftool/dbconfig/20221121-185716-ladsgroup.json
  • 18:55 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2001.mgmt.codfw.wmnet with reboot policy FORCED
  • 18:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2123', diff saved to https://phabricator.wikimedia.org/P40284 and previous config saved to /var/cache/conftool/dbconfig/20221121-185145-ladsgroup.json
  • 18:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105', diff saved to https://phabricator.wikimedia.org/P40283 and previous config saved to /var/cache/conftool/dbconfig/20221121-184610-ladsgroup.json
  • 18:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P40282 and previous config saved to /var/cache/conftool/dbconfig/20221121-184525-ladsgroup.json
  • 18:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126 (T322618)', diff saved to https://phabricator.wikimedia.org/P40281 and previous config saved to /var/cache/conftool/dbconfig/20221121-184414-ladsgroup.json
  • 18:44 sukhe: reprepro -C component/dnsdist include bullseye-wikimedia dnsdist_1.7.2-1+wmf11u1_amd64.changes: T305589
  • 18:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T322618)', diff saved to https://phabricator.wikimedia.org/P40280 and previous config saved to /var/cache/conftool/dbconfig/20221121-184210-ladsgroup.json
  • 18:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2126 (T322618)', diff saved to https://phabricator.wikimedia.org/P40279 and previous config saved to /var/cache/conftool/dbconfig/20221121-184155-ladsgroup.json
  • 18:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2095.codfw.wmnet with reason: Maintenance
  • 18:41 sukhe: remove dnsdist 1.7.2-1+wmf11u1 from apt.wm.o (bullseye, erroneously imported in main)
  • 18:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2095.codfw.wmnet with reason: Maintenance
  • 18:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2126.codfw.wmnet with reason: Maintenance
  • 18:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2126.codfw.wmnet with reason: Maintenance
  • 18:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2125 (T322618)', diff saved to https://phabricator.wikimedia.org/P40278 and previous config saved to /var/cache/conftool/dbconfig/20221121-184107-ladsgroup.json
  • 18:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3312 (T322618)', diff saved to https://phabricator.wikimedia.org/P40277 and previous config saved to /var/cache/conftool/dbconfig/20221121-183959-ladsgroup.json
  • 18:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
  • 18:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
  • 18:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
  • 18:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
  • 18:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 (T322618)', diff saved to https://phabricator.wikimedia.org/P40276 and previous config saved to /var/cache/conftool/dbconfig/20221121-183919-ladsgroup.json
  • 18:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2123 (T323214)', diff saved to https://phabricator.wikimedia.org/P40275 and previous config saved to /var/cache/conftool/dbconfig/20221121-183639-ladsgroup.json
  • 18:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105 (T322618)', diff saved to https://phabricator.wikimedia.org/P40274 and previous config saved to /var/cache/conftool/dbconfig/20221121-183104-ladsgroup.json
  • 18:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P40273 and previous config saved to /var/cache/conftool/dbconfig/20221121-183019-ladsgroup.json
  • 18:27 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-jumbo1010.eqiad.wmnet with OS bullseye
  • 18:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2125', diff saved to https://phabricator.wikimedia.org/P40272 and previous config saved to /var/cache/conftool/dbconfig/20221121-182601-ladsgroup.json
  • 18:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P40271 and previous config saved to /var/cache/conftool/dbconfig/20221121-182412-ladsgroup.json
  • 18:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2105 (T322618)', diff saved to https://phabricator.wikimedia.org/P40270 and previous config saved to /var/cache/conftool/dbconfig/20221121-182306-ladsgroup.json
  • 18:23 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
  • 18:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
  • 18:22 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host sretest2001.mgmt.codfw.wmnet with reboot policy FORCED
  • 18:22 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2001.mgmt.codfw.wmnet with reboot policy FORCED
  • 18:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 (T322618)', diff saved to https://phabricator.wikimedia.org/P40269 and previous config saved to /var/cache/conftool/dbconfig/20221121-181512-ladsgroup.json
  • 18:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'db2105 (re)pooling @ 100%: Maint done', diff saved to https://phabricator.wikimedia.org/P40268 and previous config saved to /var/cache/conftool/dbconfig/20221121-181203-ladsgroup.json
  • 18:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1112 (T322618)', diff saved to https://phabricator.wikimedia.org/P40267 and previous config saved to /var/cache/conftool/dbconfig/20221121-181116-ladsgroup.json
  • 18:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 18:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2125', diff saved to https://phabricator.wikimedia.org/P40266 and previous config saved to /var/cache/conftool/dbconfig/20221121-181054-ladsgroup.json
  • 18:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 18:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
  • 18:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
  • 18:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P40265 and previous config saved to /var/cache/conftool/dbconfig/20221121-180906-ladsgroup.json
  • 18:05 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host sretest2001.mgmt.codfw.wmnet with reboot policy FORCED
  • 18:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
  • 18:02 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
  • 18:00 bking@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: apply config changes - bking@cumin1001 - T319020
  • 17:59 bking@cumin1001: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: apply config changes - bking@cumin1001 - T319020
  • 17:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'db2105 (re)pooling @ 75%: Maint done', diff saved to https://phabricator.wikimedia.org/P40264 and previous config saved to /var/cache/conftool/dbconfig/20221121-175658-ladsgroup.json
  • 17:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2125 (T322618)', diff saved to https://phabricator.wikimedia.org/P40263 and previous config saved to /var/cache/conftool/dbconfig/20221121-175548-ladsgroup.json
  • 17:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 (T322618)', diff saved to https://phabricator.wikimedia.org/P40262 and previous config saved to /var/cache/conftool/dbconfig/20221121-175359-ladsgroup.json
  • 17:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2125 (T322618)', diff saved to https://phabricator.wikimedia.org/P40261 and previous config saved to /var/cache/conftool/dbconfig/20221121-175328-ladsgroup.json
  • 17:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2125.codfw.wmnet with reason: Maintenance
  • 17:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2125.codfw.wmnet with reason: Maintenance
  • 17:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2104 (T322618)', diff saved to https://phabricator.wikimedia.org/P40260 and previous config saved to /var/cache/conftool/dbconfig/20221121-175306-ladsgroup.json
  • 17:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1129 (T322618)', diff saved to https://phabricator.wikimedia.org/P40259 and previous config saved to /var/cache/conftool/dbconfig/20221121-175149-ladsgroup.json
  • 17:51 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1129.eqiad.wmnet with reason: Maintenance
  • 17:51 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1129.eqiad.wmnet with reason: Maintenance
  • 17:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 (T322618)', diff saved to https://phabricator.wikimedia.org/P40258 and previous config saved to /var/cache/conftool/dbconfig/20221121-175127-ladsgroup.json
  • 17:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'db2105 (re)pooling @ 25%: Maint done', diff saved to https://phabricator.wikimedia.org/P40257 and previous config saved to /var/cache/conftool/dbconfig/20221121-174153-ladsgroup.json
  • 17:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2104', diff saved to https://phabricator.wikimedia.org/P40256 and previous config saved to /var/cache/conftool/dbconfig/20221121-173800-ladsgroup.json
  • 17:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P40255 and previous config saved to /var/cache/conftool/dbconfig/20221121-173621-ladsgroup.json
  • 17:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2123 (T323214)', diff saved to https://phabricator.wikimedia.org/P40254 and previous config saved to /var/cache/conftool/dbconfig/20221121-173203-ladsgroup.json
  • 17:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2123.codfw.wmnet with reason: Maintenance
  • 17:31 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2123.codfw.wmnet with reason: Maintenance
  • 17:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2111 (T323214)', diff saved to https://phabricator.wikimedia.org/P40253 and previous config saved to /var/cache/conftool/dbconfig/20221121-173141-ladsgroup.json
  • 17:31 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-jumbo1010.eqiad.wmnet with OS bullseye
  • 17:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'db2105 (re)pooling @ 10%: Maint done', diff saved to https://phabricator.wikimedia.org/P40252 and previous config saved to /var/cache/conftool/dbconfig/20221121-172648-ladsgroup.json
  • 17:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1110 (T323214)', diff saved to https://phabricator.wikimedia.org/P40251 and previous config saved to /var/cache/conftool/dbconfig/20221121-172314-ladsgroup.json
  • 17:23 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1110.eqiad.wmnet with reason: Maintenance
  • 17:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1110.eqiad.wmnet with reason: Maintenance
  • 17:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100 (T323214)', diff saved to https://phabricator.wikimedia.org/P40250 and previous config saved to /var/cache/conftool/dbconfig/20221121-172253-ladsgroup.json
  • 17:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P40249 and previous config saved to /var/cache/conftool/dbconfig/20221121-172114-ladsgroup.json
  • 17:20 robh@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['lvs4009']
  • 17:19 robh@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['lvs4010']
  • 17:19 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['lvs4010']
  • 17:18 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['lvs4009']
  • 17:17 robh@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs4010.mgmt.ulsfo.wmnet with reboot policy FORCED
  • 17:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2111', diff saved to https://phabricator.wikimedia.org/P40248 and previous config saved to /var/cache/conftool/dbconfig/20221121-171635-ladsgroup.json
  • 17:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2105 (T322618)', diff saved to https://phabricator.wikimedia.org/P40247 and previous config saved to /var/cache/conftool/dbconfig/20221121-171615-ladsgroup.json
  • 17:16 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
  • 17:15 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
  • 17:14 robh@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs4009.mgmt.ulsfo.wmnet with reboot policy FORCED
  • 17:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100', diff saved to https://phabricator.wikimedia.org/P40246 and previous config saved to /var/cache/conftool/dbconfig/20221121-170746-ladsgroup.json
  • 17:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 (T322618)', diff saved to https://phabricator.wikimedia.org/P40245 and previous config saved to /var/cache/conftool/dbconfig/20221121-170608-ladsgroup.json
  • 17:05 robh@cumin2002: START - Cookbook sre.hosts.provision for host lvs4010.mgmt.ulsfo.wmnet with reboot policy FORCED
  • 17:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2104 (T322618)', diff saved to https://phabricator.wikimedia.org/P40244 and previous config saved to /var/cache/conftool/dbconfig/20221121-170529-ladsgroup.json
  • 17:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2104.codfw.wmnet with reason: Maintenance
  • 17:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2104.codfw.wmnet with reason: Maintenance
  • 17:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2097.codfw.wmnet with reason: Maintenance
  • 17:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2097.codfw.wmnet with reason: Maintenance
  • 17:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3312 (T322618)', diff saved to https://phabricator.wikimedia.org/P40243 and previous config saved to /var/cache/conftool/dbconfig/20221121-170357-ladsgroup.json
  • 17:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
  • 17:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
  • 17:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
  • 17:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
  • 17:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2111', diff saved to https://phabricator.wikimedia.org/P40242 and previous config saved to /var/cache/conftool/dbconfig/20221121-170127-ladsgroup.json
  • 17:00 robh@cumin2002: START - Cookbook sre.hosts.provision for host lvs4009.mgmt.ulsfo.wmnet with reboot policy FORCED
  • 17:00 jdrewniak@deploy1002: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 03m 38s)
  • 16:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
  • 16:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
  • 16:56 jdrewniak@deploy1002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 03m 36s)
  • 16:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
  • 16:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
  • 16:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100', diff saved to https://phabricator.wikimedia.org/P40241 and previous config saved to /var/cache/conftool/dbconfig/20221121-165240-ladsgroup.json
  • 16:47 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
  • 16:47 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
  • 16:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2111 (T323214)', diff saved to https://phabricator.wikimedia.org/P40240 and previous config saved to /var/cache/conftool/dbconfig/20221121-164620-ladsgroup.json
  • 16:43 robh@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host lvs4010.mgmt.ulsfo.wmnet with reboot policy FORCED
  • 16:39 robh@cumin2002: START - Cookbook sre.hosts.provision for host lvs4010.mgmt.ulsfo.wmnet with reboot policy FORCED
  • 16:38 robh@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host lvs4009.mgmt.ulsfo.wmnet with reboot policy FORCED
  • 16:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100 (T323214)', diff saved to https://phabricator.wikimedia.org/P40239 and previous config saved to /var/cache/conftool/dbconfig/20221121-163733-ladsgroup.json
  • 16:35 robh@cumin2002: START - Cookbook sre.hosts.provision for host lvs4009.mgmt.ulsfo.wmnet with reboot policy FORCED
  • 16:17 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5030.eqsin.wmnet with OS buster
  • 16:04 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1051.eqiad.wmnet with OS bullseye
  • 15:54 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ mwscript extensions/Wikibase/repo/maintenance/changePropertyDataType.php wikidatawiki --property-id P11136 --new-data-type string # T323470
  • 15:45 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5030.eqsin.wmnet with reason: host reimage
  • 15:42 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5030.eqsin.wmnet with reason: host reimage
  • 15:37 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1051.eqiad.wmnet with reason: host reimage
  • 15:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1100 (T323214)', diff saved to https://phabricator.wikimedia.org/P40238 and previous config saved to /var/cache/conftool/dbconfig/20221121-153705-ladsgroup.json
  • 15:37 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1100.eqiad.wmnet with reason: Maintenance
  • 15:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1100.eqiad.wmnet with reason: Maintenance
  • 15:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 (T323214)', diff saved to https://phabricator.wikimedia.org/P40237 and previous config saved to /var/cache/conftool/dbconfig/20221121-153611-ladsgroup.json
  • 15:33 aborrero@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1051.eqiad.wmnet with reason: host reimage
  • 15:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2174.codfw.wmnet with reason: hw issues
  • 15:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2174.codfw.wmnet with reason: hw issues
  • 15:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P40236 and previous config saved to /var/cache/conftool/dbconfig/20221121-152105-ladsgroup.json
  • 15:19 aborrero@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1051.eqiad.wmnet with OS bullseye
  • 15:16 urandom: initiating Cassandra bootstrap, aqs1018-a -- T307802
  • 15:15 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp5030.eqsin.wmnet with OS buster
  • 15:15 jynus@cumin1001: dbctl commit (dc=all): 'Depool db2174 - crash?', diff saved to https://phabricator.wikimedia.org/P40235 and previous config saved to /var/cache/conftool/dbconfig/20221121-151501-jynus.json
  • 15:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P40234 and previous config saved to /var/cache/conftool/dbconfig/20221121-150558-ladsgroup.json
  • 14:54 btullis@cumin1001: END (FAIL) - Cookbook sre.wikireplicas.add-wiki (exit_code=99)
  • 14:54 btullis@cumin1001: START - Cookbook sre.wikireplicas.add-wiki
  • 14:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 (T323214)', diff saved to https://phabricator.wikimedia.org/P40233 and previous config saved to /var/cache/conftool/dbconfig/20221121-145052-ladsgroup.json
  • 14:48 gehel: repooling elastic2052 - T320482
  • 14:48 gehel@cumin1001: conftool action : set/pooled=yes; selector: dc=codfw,name=elastic2052.codfw.wmnet
  • 14:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2111 (T323214)', diff saved to https://phabricator.wikimedia.org/P40232 and previous config saved to /var/cache/conftool/dbconfig/20221121-144234-ladsgroup.json
  • 14:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2111.codfw.wmnet with reason: Maintenance
  • 14:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2111.codfw.wmnet with reason: Maintenance
  • 14:40 godog: nuke old objectcache metrics from graphite hosts - T323357
  • 14:38 bking@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: apply config changes - bking@cumin1001 - T319020
  • 14:34 urbanecm@deploy1002: Finished scap: Backport for SimpleParsoidOutputStash: use makeKey() (T323357) (duration: 07m 58s)
  • 14:26 urbanecm@deploy1002: urbanecm and daniel: Backport for SimpleParsoidOutputStash: use makeKey() (T323357) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet
  • 14:26 urbanecm@deploy1002: Started scap: Backport for SimpleParsoidOutputStash: use makeKey() (T323357)
  • 14:25 urbanecm@deploy1002: Finished scap: Backport for HookUtils::parseRevisionParsoidHtml doesn't need HTML for editing (T323357) (duration: 14m 06s)
  • 14:12 urbanecm@deploy1002: urbanecm and daniel: Backport for HookUtils::parseRevisionParsoidHtml doesn't need HTML for editing (T323357) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
  • 14:11 urbanecm@deploy1002: Started scap: Backport for HookUtils::parseRevisionParsoidHtml doesn't need HTML for editing (T323357)
  • 14:10 urbanecm@deploy1002: Finished scap: Backport for Set parser cache write probability for /page/html endpoint. (duration: 04m 37s)
  • 14:05 urbanecm@deploy1002: Started scap: Backport for Set parser cache write probability for /page/html endpoint.
  • 14:04 urbanecm@deploy1002: backport aborted: (duration: 00m 51s)
  • 13:54 jbond@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ms-be2050.codfw.wmnet
  • 13:53 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1052.eqiad.wmnet with OS bullseye
  • 13:48 jbond@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ms-be2050.codfw.wmnet
  • 13:34 godog: there will be a progressive rolling restart of prometheus after https://gerrit.wikimedia.org/r/857522
  • 13:26 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1052.eqiad.wmnet with reason: host reimage
  • 13:24 aborrero@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1052.eqiad.wmnet with reason: host reimage
  • 13:15 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/thumbor: sync
  • 13:14 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/thumbor: sync
  • 13:10 aborrero@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1052.eqiad.wmnet with OS bullseye
  • 13:09 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/thumbor: sync
  • 13:09 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/thumbor: sync
  • 12:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2101.codfw.wmnet with reason: Maintenance
  • 12:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3315 (T323214)', diff saved to https://phabricator.wikimedia.org/P40231 and previous config saved to /var/cache/conftool/dbconfig/20221121-124146-ladsgroup.json
  • 12:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1096.eqiad.wmnet with reason: Maintenance
  • 12:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2101.codfw.wmnet with reason: Maintenance
  • 12:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1096.eqiad.wmnet with reason: Maintenance
  • 12:15 jnuche@deploy1002: Installation of scap version "4.29.0" completed for 559 hosts
  • 12:14 jnuche@deploy1002: Installing scap version "4.29.0" for 559 hosts
  • 11:21 aborrero@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=1) for host cloudvirt1053.eqiad.wmnet with OS bullseye
  • 10:54 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1053.eqiad.wmnet with reason: host reimage
  • 10:52 aborrero@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1053.eqiad.wmnet with reason: host reimage
  • 10:48 btullis@cumin1001: END (FAIL) - Cookbook sre.wikireplicas.add-wiki (exit_code=99)
  • 10:48 btullis@cumin1001: START - Cookbook sre.wikireplicas.add-wiki
  • 10:38 aborrero@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1053.eqiad.wmnet with OS bullseye
  • 09:31 elukey@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
  • 09:31 elukey@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
  • 09:29 elukey@deploy1002: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
  • 09:28 elukey@deploy1002: helmfile [staging] START helmfile.d/services/eventgate-main: sync
  • 09:15 elukey: restart ml-serve-codfw's kube-apiserver to clear some knative LIST certificate workload (still not sure what it is, but it seems to be a bug related to our ancient version)
  • 08:31 urbanecm@deploy1002: Finished scap: Backport for GrowthExperiments: Enable unstarred mentorship filters at all wikis (T318457) (duration: 08m 04s)
  • 08:24 urbanecm@deploy1002: urbanecm and urbanecm: Backport for GrowthExperiments: Enable unstarred mentorship filters at all wikis (T318457) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet
  • 08:23 urbanecm@deploy1002: Started scap: Backport for GrowthExperiments: Enable unstarred mentorship filters at all wikis (T318457)
  • 02:12 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5029.eqsin.wmnet with OS buster
  • 01:41 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5029.eqsin.wmnet with reason: host reimage
  • 01:37 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5029.eqsin.wmnet with reason: host reimage
  • 01:08 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp5029.eqsin.wmnet with OS buster
  • 01:08 sukhe@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp5029.eqsin.wmnet with OS buster
  • 00:51 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp5029.eqsin.wmnet with OS buster
  • 00:50 sukhe@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp5029.eqsin.wmnet with OS buster
  • 00:50 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp5029.eqsin.wmnet with OS buster
  • 00:23 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp5029.eqsin.wmnet with OS buster

2022-11-20

  • 20:29 urandom: initiating Cassandra bootstrap, aqs1020-b -- T307802
  • 19:16 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5028.eqsin.wmnet with OS buster
  • 18:47 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5028.eqsin.wmnet with reason: host reimage
  • 18:43 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5028.eqsin.wmnet with reason: host reimage
  • 18:14 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp5028.eqsin.wmnet with OS buster

2022-11-19

  • 22:51 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5020.eqsin.wmnet with OS buster
  • 22:19 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5020.eqsin.wmnet with reason: host reimage
  • 22:15 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5020.eqsin.wmnet with reason: host reimage
  • 21:48 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp5020.eqsin.wmnet with OS buster
  • 21:41 urandom: initiating Cassandra bootstrap, aqs1020-a -- T307802
  • 21:30 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5019.eqsin.wmnet with OS buster
  • 20:59 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5019.eqsin.wmnet with reason: host reimage
  • 20:56 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5019.eqsin.wmnet with reason: host reimage
  • 20:29 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp5019.eqsin.wmnet with OS buster
  • 08:10 elukey: re-created knative pods misbehaving for ml-serve-codfw (causing latency alerts)
  • 02:01 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5018.eqsin.wmnet with OS buster
  • 01:28 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5018.eqsin.wmnet with reason: host reimage
  • 01:24 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5018.eqsin.wmnet with reason: host reimage
  • 00:56 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp5018.eqsin.wmnet with OS buster
  • 00:29 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-jumbo1013']
  • 00:23 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1013']
  • 00:17 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-jumbo1013']
  • 00:02 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1013']

2022-11-18

  • 23:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 23:57 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 23:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T323214)', diff saved to https://phabricator.wikimedia.org/P40226 and previous config saved to /var/cache/conftool/dbconfig/20221118-235749-ladsgroup.json
  • 23:57 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-jumbo1013.mgmt.eqiad.wmnet with reboot policy FORCED
  • 23:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T323214)', diff saved to https://phabricator.wikimedia.org/P40225 and previous config saved to /var/cache/conftool/dbconfig/20221118-235631-ladsgroup.json
  • 23:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P40223 and previous config saved to /var/cache/conftool/dbconfig/20221118-234242-ladsgroup.json
  • 23:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P40222 and previous config saved to /var/cache/conftool/dbconfig/20221118-234124-ladsgroup.json
  • 23:28 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host kafka-jumbo1013.mgmt.eqiad.wmnet with reboot policy FORCED
  • 23:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P40221 and previous config saved to /var/cache/conftool/dbconfig/20221118-232736-ladsgroup.json
  • 23:27 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 23:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P40220 and previous config saved to /var/cache/conftool/dbconfig/20221118-232618-ladsgroup.json
  • 23:25 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 23:22 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 23:21 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 23:13 mutante: clouddumps1001 - manually ran /usr/local/bin/dump-fetch-phabdumps.sh and confirmed fetching works from new phab host phab1004 after gerrit:824805 T280597
  • 23:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T323214)', diff saved to https://phabricator.wikimedia.org/P40219 and previous config saved to /var/cache/conftool/dbconfig/20221118-231229-ladsgroup.json
  • 23:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T323214)', diff saved to https://phabricator.wikimedia.org/P40218 and previous config saved to /var/cache/conftool/dbconfig/20221118-231111-ladsgroup.json
  • 23:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1202 (T323214)', diff saved to https://phabricator.wikimedia.org/P40217 and previous config saved to /var/cache/conftool/dbconfig/20221118-230152-ladsgroup.json
  • 23:01 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1202.eqiad.wmnet with reason: Maintenance
  • 23:01 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1202.eqiad.wmnet with reason: Maintenance
  • 23:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T323214)', diff saved to https://phabricator.wikimedia.org/P40216 and previous config saved to /var/cache/conftool/dbconfig/20221118-230131-ladsgroup.json
  • 22:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2182 (T323214)', diff saved to https://phabricator.wikimedia.org/P40215 and previous config saved to /var/cache/conftool/dbconfig/20221118-225002-ladsgroup.json
  • 22:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2182.codfw.wmnet with reason: Maintenance
  • 22:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2182.codfw.wmnet with reason: Maintenance
  • 22:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317 (T323214)', diff saved to https://phabricator.wikimedia.org/P40214 and previous config saved to /var/cache/conftool/dbconfig/20221118-224940-ladsgroup.json
  • 22:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P40213 and previous config saved to /var/cache/conftool/dbconfig/20221118-224625-ladsgroup.json
  • 22:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317', diff saved to https://phabricator.wikimedia.org/P40212 and previous config saved to /var/cache/conftool/dbconfig/20221118-223434-ladsgroup.json
  • 22:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P40211 and previous config saved to /var/cache/conftool/dbconfig/20221118-223118-ladsgroup.json
  • 22:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317', diff saved to https://phabricator.wikimedia.org/P40210 and previous config saved to /var/cache/conftool/dbconfig/20221118-221927-ladsgroup.json
  • 22:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T323214)', diff saved to https://phabricator.wikimedia.org/P40209 and previous config saved to /var/cache/conftool/dbconfig/20221118-221612-ladsgroup.json
  • 22:05 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5017.eqsin.wmnet with OS buster
  • 22:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1194 (T323214)', diff saved to https://phabricator.wikimedia.org/P40207 and previous config saved to /var/cache/conftool/dbconfig/20221118-220512-ladsgroup.json
  • 22:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1194.eqiad.wmnet with reason: Maintenance
  • 22:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1194.eqiad.wmnet with reason: Maintenance
  • 22:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T323214)', diff saved to https://phabricator.wikimedia.org/P40206 and previous config saved to /var/cache/conftool/dbconfig/20221118-220450-ladsgroup.json
  • 22:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317 (T323214)', diff saved to https://phabricator.wikimedia.org/P40205 and previous config saved to /var/cache/conftool/dbconfig/20221118-220421-ladsgroup.json
  • 21:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P40204 and previous config saved to /var/cache/conftool/dbconfig/20221118-214944-ladsgroup.json
  • 21:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2169:3317 (T323214)', diff saved to https://phabricator.wikimedia.org/P40203 and previous config saved to /var/cache/conftool/dbconfig/20221118-214230-ladsgroup.json
  • 21:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
  • 21:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
  • 21:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317 (T323214)', diff saved to https://phabricator.wikimedia.org/P40202 and previous config saved to /var/cache/conftool/dbconfig/20221118-214208-ladsgroup.json
  • 21:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P40201 and previous config saved to /var/cache/conftool/dbconfig/20221118-213437-ladsgroup.json
  • 21:32 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5017.eqsin.wmnet with reason: host reimage
  • 21:27 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5017.eqsin.wmnet with reason: host reimage
  • 21:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317', diff saved to https://phabricator.wikimedia.org/P40200 and previous config saved to /var/cache/conftool/dbconfig/20221118-212702-ladsgroup.json
  • 21:21 mutante: running phabricator task dump script on phab1004
  • 21:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T323214)', diff saved to https://phabricator.wikimedia.org/P40199 and previous config saved to /var/cache/conftool/dbconfig/20221118-211931-ladsgroup.json
  • 21:17 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-jumbo1015']
  • 21:14 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1015']
  • 21:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317', diff saved to https://phabricator.wikimedia.org/P40198 and previous config saved to /var/cache/conftool/dbconfig/20221118-211155-ladsgroup.json
  • 21:09 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['kafka-jumbo1015']
  • 21:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1191 (T323214)', diff saved to https://phabricator.wikimedia.org/P40197 and previous config saved to /var/cache/conftool/dbconfig/20221118-210825-ladsgroup.json
  • 21:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1191.eqiad.wmnet with reason: Maintenance
  • 21:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1191.eqiad.wmnet with reason: Maintenance
  • 21:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T323214)', diff saved to https://phabricator.wikimedia.org/P40196 and previous config saved to /var/cache/conftool/dbconfig/20221118-210804-ladsgroup.json
  • 20:56 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp5017.eqsin.wmnet with OS buster
  • 20:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317 (T323214)', diff saved to https://phabricator.wikimedia.org/P40195 and previous config saved to /var/cache/conftool/dbconfig/20221118-205649-ladsgroup.json
  • 20:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P40194 and previous config saved to /var/cache/conftool/dbconfig/20221118-205258-ladsgroup.json
  • 20:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P40193 and previous config saved to /var/cache/conftool/dbconfig/20221118-203751-ladsgroup.json
  • 20:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2168:3317 (T323214)', diff saved to https://phabricator.wikimedia.org/P40192 and previous config saved to /var/cache/conftool/dbconfig/20221118-203302-ladsgroup.json
  • 20:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2168.codfw.wmnet with reason: Maintenance
  • 20:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2168.codfw.wmnet with reason: Maintenance
  • 20:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T323214)', diff saved to https://phabricator.wikimedia.org/P40191 and previous config saved to /var/cache/conftool/dbconfig/20221118-203241-ladsgroup.json
  • 20:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T323214)', diff saved to https://phabricator.wikimedia.org/P40190 and previous config saved to /var/cache/conftool/dbconfig/20221118-202245-ladsgroup.json
  • 20:21 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1015']
  • 20:18 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-jumbo1015.mgmt.eqiad.wmnet with reboot policy FORCED
  • 20:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P40189 and previous config saved to /var/cache/conftool/dbconfig/20221118-201734-ladsgroup.json
  • 20:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1174 (T323214)', diff saved to https://phabricator.wikimedia.org/P40188 and previous config saved to /var/cache/conftool/dbconfig/20221118-201030-ladsgroup.json
  • 20:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 20:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 20:08 robh@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts ['cp5031']
  • 20:07 robh@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cp5029']
  • 20:06 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp5029']
  • 20:04 robh@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts ['cp5029']
  • 20:03 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp5029']
  • 20:03 robh@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts ['cp5029']
  • 20:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P40187 and previous config saved to /var/cache/conftool/dbconfig/20221118-200228-ladsgroup.json
  • 19:59 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp5031']
  • 19:58 robh@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cp5030']
  • 19:58 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-jumbo1012']
  • 19:58 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1012']
  • 19:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 19:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 19:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T323214)', diff saved to https://phabricator.wikimedia.org/P40186 and previous config saved to /var/cache/conftool/dbconfig/20221118-194859-ladsgroup.json
  • 19:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T323214)', diff saved to https://phabricator.wikimedia.org/P40185 and previous config saved to /var/cache/conftool/dbconfig/20221118-194721-ladsgroup.json
  • 19:46 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp5030']
  • 19:46 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['kafka-jumbo1012']
  • 19:44 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host kafka-jumbo1015.mgmt.eqiad.wmnet with reboot policy FORCED
  • 19:36 robh@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cp5028']
  • 19:34 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['kafka-jumbo1014']
  • 19:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P40184 and previous config saved to /var/cache/conftool/dbconfig/20221118-193353-ladsgroup.json
  • 19:31 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp5029']
  • 19:31 robh@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cp5020']
  • 19:28 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1014']
  • 19:27 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1012']
  • 19:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2159 (T323214)', diff saved to https://phabricator.wikimedia.org/P40183 and previous config saved to /var/cache/conftool/dbconfig/20221118-192452-ladsgroup.json
  • 19:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2095.codfw.wmnet with reason: Maintenance
  • 19:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2095.codfw.wmnet with reason: Maintenance
  • 19:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2159.codfw.wmnet with reason: Maintenance
  • 19:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2159.codfw.wmnet with reason: Maintenance
  • 19:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T323214)', diff saved to https://phabricator.wikimedia.org/P40182 and previous config saved to /var/cache/conftool/dbconfig/20221118-192425-ladsgroup.json
  • 19:24 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['kafka-jumbo1012']
  • 19:24 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp5028']
  • 19:23 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-jumbo1014']
  • 19:23 robh@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cp5019']
  • 19:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P40181 and previous config saved to /var/cache/conftool/dbconfig/20221118-191846-ladsgroup.json
  • 19:18 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp5020']
  • 19:15 robh@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cp5018']
  • 19:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P40180 and previous config saved to /var/cache/conftool/dbconfig/20221118-190919-ladsgroup.json
  • 19:07 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp5019']
  • 19:07 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1014']
  • 19:06 robh@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cp5017']
  • 19:05 pt1979@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts ['kafka-jumbo1010']
  • 19:05 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1010']
  • 19:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T323214)', diff saved to https://phabricator.wikimedia.org/P40179 and previous config saved to /var/cache/conftool/dbconfig/20221118-190340-ladsgroup.json
  • 19:03 pt1979@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts ['kafka-jumbo1014']
  • 19:03 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1014']
  • 19:02 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp5018']
  • 18:54 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp5017']
  • 18:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P40178 and previous config saved to /var/cache/conftool/dbconfig/20221118-185412-ladsgroup.json
  • 18:52 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1012']
  • 18:51 robh@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts ['cp5017']
  • 18:51 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cp5017']
  • 18:47 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-jumbo1012.mgmt.eqiad.wmnet with reboot policy FORCED
  • 18:45 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-jumbo1014.mgmt.eqiad.wmnet with reboot policy FORCED
  • 18:43 robh@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cp5031.mgmt.eqsin.wmnet with reboot policy FORCED
  • 18:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3317 (T323214)', diff saved to https://phabricator.wikimedia.org/P40177 and previous config saved to /var/cache/conftool/dbconfig/20221118-184258-ladsgroup.json
  • 18:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 18:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 18:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T323214)', diff saved to https://phabricator.wikimedia.org/P40176 and previous config saved to /var/cache/conftool/dbconfig/20221118-184236-ladsgroup.json
  • 18:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T323214)', diff saved to https://phabricator.wikimedia.org/P40175 and previous config saved to /var/cache/conftool/dbconfig/20221118-183906-ladsgroup.json
  • 18:32 robh@cumin2002: START - Cookbook sre.hosts.provision for host cp5031.mgmt.eqsin.wmnet with reboot policy FORCED
  • 18:31 robh@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cp5030.mgmt.eqsin.wmnet with reboot policy FORCED
  • 18:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P40174 and previous config saved to /var/cache/conftool/dbconfig/20221118-182730-ladsgroup.json
  • 18:21 herron: removed older exim logs to free space T305567
  • 18:20 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host kafka-jumbo1014.mgmt.eqiad.wmnet with reboot policy FORCED
  • 18:19 robh@cumin2002: START - Cookbook sre.hosts.provision for host cp5030.mgmt.eqsin.wmnet with reboot policy FORCED
  • 18:18 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host kafka-jumbo1012.mgmt.eqiad.wmnet with reboot policy FORCED
  • 18:18 robh@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cp5029.mgmt.eqsin.wmnet with reboot policy FORCED
  • 18:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2150 (T323214)', diff saved to https://phabricator.wikimedia.org/P40173 and previous config saved to /var/cache/conftool/dbconfig/20221118-181741-ladsgroup.json
  • 18:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2150.codfw.wmnet with reason: Maintenance
  • 18:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2150.codfw.wmnet with reason: Maintenance
  • 18:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T323214)', diff saved to https://phabricator.wikimedia.org/P40172 and previous config saved to /var/cache/conftool/dbconfig/20221118-181720-ladsgroup.json
  • 18:15 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-jumbo1011']
  • 18:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P40171 and previous config saved to /var/cache/conftool/dbconfig/20221118-181223-ladsgroup.json
  • 18:06 robh@cumin2002: START - Cookbook sre.hosts.provision for host cp5029.mgmt.eqsin.wmnet with reboot policy FORCED
  • 18:05 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1011']
  • 18:04 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-jumbo1010']
  • 18:03 robh@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cp5028.mgmt.eqsin.wmnet with reboot policy FORCED
  • 18:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P40170 and previous config saved to /var/cache/conftool/dbconfig/20221118-180212-ladsgroup.json
  • 17:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T323214)', diff saved to https://phabricator.wikimedia.org/P40169 and previous config saved to /var/cache/conftool/dbconfig/20221118-175717-ladsgroup.json
  • 17:57 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1010']
  • 17:56 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['kafka-jumbo1010']
  • 17:52 robh@cumin2002: START - Cookbook sre.hosts.provision for host cp5028.mgmt.eqsin.wmnet with reboot policy FORCED
  • 17:49 robh@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cp5020.mgmt.eqsin.wmnet with reboot policy FORCED
  • 17:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P40168 and previous config saved to /var/cache/conftool/dbconfig/20221118-174702-ladsgroup.json
  • 17:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1158 (T323214)', diff saved to https://phabricator.wikimedia.org/P40167 and previous config saved to /var/cache/conftool/dbconfig/20221118-174226-ladsgroup.json
  • 17:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 17:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 17:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 17:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 17:38 robh@cumin2002: START - Cookbook sre.hosts.provision for host cp5020.mgmt.eqsin.wmnet with reboot policy FORCED
  • 17:35 robh@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cp5019.mgmt.eqsin.wmnet with reboot policy FORCED
  • 17:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136 (T323214)', diff saved to https://phabricator.wikimedia.org/P40166 and previous config saved to /var/cache/conftool/dbconfig/20221118-173516-ladsgroup.json
  • 17:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T323214)', diff saved to https://phabricator.wikimedia.org/P40165 and previous config saved to /var/cache/conftool/dbconfig/20221118-173156-ladsgroup.json
  • 17:24 robh@cumin2002: START - Cookbook sre.hosts.provision for host cp5019.mgmt.eqsin.wmnet with reboot policy FORCED
  • 17:22 robh@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cp5018.mgmt.eqsin.wmnet with reboot policy FORCED
  • 17:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136', diff saved to https://phabricator.wikimedia.org/P40164 and previous config saved to /var/cache/conftool/dbconfig/20221118-172010-ladsgroup.json
  • 17:19 thcipriani@deploy1002: Finished scap: Backport for VE: Use instead of in CE HTML (T323343), Undo use of .reference instead of .mw-ref in CSS counter rules (T323343) (duration: 05m 58s)
  • 17:19 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1010']
  • 17:19 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['kafka-jumbo1010']
  • 17:15 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1010']
  • 17:15 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['kafka-jumbo1010']
  • 17:15 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1010']
  • 17:13 thcipriani@deploy1002: thcipriani and matmarex: Backport for VE: Use instead of in CE HTML (T323343), Undo use of .reference instead of .mw-ref in CSS counter rules (T323343) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
  • 17:13 thcipriani@deploy1002: Started scap: Backport for VE: Use instead of in CE HTML (T323343), Undo use of .reference instead of .mw-ref in CSS counter rules (T323343)
  • 17:12 jbond@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['kafka-jumbo1010']
  • 17:12 robh@cumin2002: START - Cookbook sre.hosts.provision for host cp5018.mgmt.eqsin.wmnet with reboot policy FORCED
  • 17:10 jbond@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1010']
  • 17:09 robh@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cp5017.mgmt.eqsin.wmnet with reboot policy FORCED
  • 17:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2122 (T323214)', diff saved to https://phabricator.wikimedia.org/P40163 and previous config saved to /var/cache/conftool/dbconfig/20221118-170727-ladsgroup.json
  • 17:07 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2122.codfw.wmnet with reason: Maintenance
  • 17:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2122.codfw.wmnet with reason: Maintenance
  • 17:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121 (T323214)', diff saved to https://phabricator.wikimedia.org/P40162 and previous config saved to /var/cache/conftool/dbconfig/20221118-170706-ladsgroup.json
  • 17:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136', diff saved to https://phabricator.wikimedia.org/P40161 and previous config saved to /var/cache/conftool/dbconfig/20221118-170503-ladsgroup.json
  • 16:58 robh@cumin2002: START - Cookbook sre.hosts.provision for host cp5017.mgmt.eqsin.wmnet with reboot policy FORCED
  • 16:58 claime: apple-search service decommissioned - T316296
  • 16:58 robh@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5031
  • 16:58 robh@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5031
  • 16:58 robh@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5030
  • 16:55 robh@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5030
  • 16:55 robh@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5029
  • 16:55 robh@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5029
  • 16:53 robh@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5028
  • 16:53 robh@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5028
  • 16:53 robh@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5020
  • 16:52 robh@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5020
  • 16:52 robh@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5019
  • 16:52 robh@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5019
  • 16:52 robh@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5018
  • 16:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P40160 and previous config saved to /var/cache/conftool/dbconfig/20221118-165200-ladsgroup.json
  • 16:51 robh@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5018
  • 16:51 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['kafka-jumbo1010']
  • 16:51 robh@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5017
  • 16:50 robh@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5017
  • 16:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136 (T323214)', diff saved to https://phabricator.wikimedia.org/P40159 and previous config saved to /var/cache/conftool/dbconfig/20221118-164957-ladsgroup.json
  • 16:49 robh@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:49 bking@cumin1001: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - bking@cumin1001 - T319020
  • 16:47 robh@cumin2002: START - Cookbook sre.dns.netbox
  • 16:45 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1010']
  • 16:41 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-jumbo1011']
  • 16:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1136 (T323214)', diff saved to https://phabricator.wikimedia.org/P40158 and previous config saved to /var/cache/conftool/dbconfig/20221118-163851-ladsgroup.json
  • 16:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1136.eqiad.wmnet with reason: Maintenance
  • 16:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1136.eqiad.wmnet with reason: Maintenance
  • 16:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T323214)', diff saved to https://phabricator.wikimedia.org/P40157 and previous config saved to /var/cache/conftool/dbconfig/20221118-163830-ladsgroup.json
  • 16:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P40156 and previous config saved to /var/cache/conftool/dbconfig/20221118-163653-ladsgroup.json
  • 16:27 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1011']
  • 16:26 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-jumbo1011.mgmt.eqiad.wmnet with reboot policy FORCED
  • 16:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P40155 and previous config saved to /var/cache/conftool/dbconfig/20221118-162323-ladsgroup.json
  • 16:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121 (T323214)', diff saved to https://phabricator.wikimedia.org/P40154 and previous config saved to /var/cache/conftool/dbconfig/20221118-162147-ladsgroup.json
  • 16:18 bking@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - bking@cumin1001 - T319020
  • 16:14 cgoubert@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 16:12 cgoubert@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 16:12 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 16:11 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 16:10 cgoubert@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 16:09 cgoubert@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 16:09 cgoubert@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 16:08 claime: removing apple-search namespaces - T316296
  • 16:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P40152 and previous config saved to /var/cache/conftool/dbconfig/20221118-160817-ladsgroup.json
  • 16:07 cgoubert@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 16:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2121 (T323214)', diff saved to https://phabricator.wikimedia.org/P40151 and previous config saved to /var/cache/conftool/dbconfig/20221118-160039-ladsgroup.json
  • 16:01 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 16:01 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 16:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120 (T323214)', diff saved to https://phabricator.wikimedia.org/P40150 and previous config saved to /var/cache/conftool/dbconfig/20221118-160018-ladsgroup.json
  • 15:59 bking@cumin1001: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster relforge: relforge restart - bking@cumin1001 - T319020
  • 15:55 bking@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster relforge: relforge restart - bking@cumin1001 - T319020
  • 15:54 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1005.eqiad.wmnet with OS bullseye
  • 15:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T323214)', diff saved to https://phabricator.wikimedia.org/P40149 and previous config saved to /var/cache/conftool/dbconfig/20221118-155310-ladsgroup.json
  • 15:52 bking@cumin1001: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster relforge: relforge restart - bking@cumin1001 - T319020
  • 15:52 bking@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster relforge: relforge restart - bking@cumin1001 - T319020
  • 15:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120', diff saved to https://phabricator.wikimedia.org/P40148 and previous config saved to /var/cache/conftool/dbconfig/20221118-154511-ladsgroup.json
  • 15:42 ladsgroup@deploy1002: Finished scap: Backport for Don't add lede button if mobile DiscussionTools not enabled (T323341) (duration: 08m 47s)
  • 15:40 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host kafka-jumbo1011.mgmt.eqiad.wmnet with reboot policy FORCED
  • 15:40 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1005.eqiad.wmnet with reason: host reimage
  • 15:36 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1005.eqiad.wmnet with reason: host reimage
  • 15:34 ladsgroup@deploy1002: ladsgroup and ladsgroup: Backport for Don't add lede button if mobile DiscussionTools not enabled (T323341) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
  • 15:33 ladsgroup@deploy1002: Started scap: Backport for Don't add lede button if mobile DiscussionTools not enabled (T323341)
  • 15:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120', diff saved to https://phabricator.wikimedia.org/P40147 and previous config saved to /var/cache/conftool/dbconfig/20221118-153005-ladsgroup.json
  • 15:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1127 (T323214)', diff saved to https://phabricator.wikimedia.org/P40146 and previous config saved to /var/cache/conftool/dbconfig/20221118-152820-ladsgroup.json
  • 15:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1127.eqiad.wmnet with reason: Maintenance
  • 15:28 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1127.eqiad.wmnet with reason: Maintenance
  • 15:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 (T323214)', diff saved to https://phabricator.wikimedia.org/P40145 and previous config saved to /var/cache/conftool/dbconfig/20221118-152758-ladsgroup.json
  • 15:24 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-logging1005.eqiad.wmnet with OS bullseye
  • 15:18 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120 (T323214)', diff saved to https://phabricator.wikimedia.org/P40144 and previous config saved to /var/cache/conftool/dbconfig/20221118-151458-ladsgroup.json
  • 15:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P40143 and previous config saved to /var/cache/conftool/dbconfig/20221118-151252-ladsgroup.json
  • 15:10 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 15:08 aborrero@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt2003-dev.codfw.wmnet with OS bullseye
  • 14:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P40142 and previous config saved to /var/cache/conftool/dbconfig/20221118-145746-ladsgroup.json
  • 14:54 moritzm: installing node-minimist security updates
  • 14:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2120 (T323214)', diff saved to https://phabricator.wikimedia.org/P40141 and previous config saved to /var/cache/conftool/dbconfig/20221118-145330-ladsgroup.json
  • 14:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2120.codfw.wmnet with reason: Maintenance
  • 14:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2120.codfw.wmnet with reason: Maintenance
  • 14:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108 (T323214)', diff saved to https://phabricator.wikimedia.org/P40140 and previous config saved to /var/cache/conftool/dbconfig/20221118-145308-ladsgroup.json
  • 14:45 aborrero@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt2003-dev.codfw.wmnet with reason: host reimage
  • 14:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 (T323214)', diff saved to https://phabricator.wikimedia.org/P40139 and previous config saved to /var/cache/conftool/dbconfig/20221118-144239-ladsgroup.json
  • 14:41 aborrero@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt2003-dev.codfw.wmnet with reason: host reimage
  • 14:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108', diff saved to https://phabricator.wikimedia.org/P40138 and previous config saved to /var/cache/conftool/dbconfig/20221118-143802-ladsgroup.json
  • 14:30 urandom: initiating Cassandra bootstrap, aqs1017-b -- T307802
  • 14:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T318605)', diff saved to https://phabricator.wikimedia.org/P40137 and previous config saved to /var/cache/conftool/dbconfig/20221118-142854-ladsgroup.json
  • 14:25 aborrero@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt2003-dev.codfw.wmnet with OS bullseye
  • 14:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108', diff saved to https://phabricator.wikimedia.org/P40136 and previous config saved to /var/cache/conftool/dbconfig/20221118-142255-ladsgroup.json
  • 14:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3317 (T323214)', diff saved to https://phabricator.wikimedia.org/P40135 and previous config saved to /var/cache/conftool/dbconfig/20221118-141744-ladsgroup.json
  • 14:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1101.eqiad.wmnet with reason: Maintenance
  • 14:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1101.eqiad.wmnet with reason: Maintenance
  • 14:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 (T323214)', diff saved to https://phabricator.wikimedia.org/P40134 and previous config saved to /var/cache/conftool/dbconfig/20221118-141722-ladsgroup.json
  • 14:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P40133 and previous config saved to /var/cache/conftool/dbconfig/20221118-141347-ladsgroup.json
  • 14:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108 (T323214)', diff saved to https://phabricator.wikimedia.org/P40132 and previous config saved to /var/cache/conftool/dbconfig/20221118-140749-ladsgroup.json
  • 14:04 aborrero@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt2002-dev.codfw.wmnet with OS bullseye
  • 14:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P40131 and previous config saved to /var/cache/conftool/dbconfig/20221118-140216-ladsgroup.json
  • 13:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P40130 and previous config saved to /var/cache/conftool/dbconfig/20221118-135841-ladsgroup.json
  • 13:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P40129 and previous config saved to /var/cache/conftool/dbconfig/20221118-134709-ladsgroup.json
  • 13:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2108 (T323214)', diff saved to https://phabricator.wikimedia.org/P40128 and previous config saved to /var/cache/conftool/dbconfig/20221118-134633-ladsgroup.json
  • 13:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2108.codfw.wmnet with reason: Maintenance
  • 13:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2108.codfw.wmnet with reason: Maintenance
  • 13:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T318605)', diff saved to https://phabricator.wikimedia.org/P40127 and previous config saved to /var/cache/conftool/dbconfig/20221118-134334-ladsgroup.json
  • 13:35 aborrero@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt2002-dev.codfw.wmnet with reason: host reimage
  • 13:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 (T323214)', diff saved to https://phabricator.wikimedia.org/P40126 and previous config saved to /var/cache/conftool/dbconfig/20221118-133203-ladsgroup.json
  • 13:31 aborrero@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt2002-dev.codfw.wmnet with reason: host reimage
  • 13:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2100.codfw.wmnet with reason: Maintenance
  • 13:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2100.codfw.wmnet with reason: Maintenance
  • 13:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1166 (T318605)', diff saved to https://phabricator.wikimedia.org/P40125 and previous config saved to /var/cache/conftool/dbconfig/20221118-132141-ladsgroup.json
  • 13:21 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 13:21 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 13:14 aborrero@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt2002-dev.codfw.wmnet with OS bullseye
  • 13:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 13:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3317 (T323214)', diff saved to https://phabricator.wikimedia.org/P40124 and previous config saved to /var/cache/conftool/dbconfig/20221118-130829-ladsgroup.json
  • 13:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1098.eqiad.wmnet with reason: Maintenance
  • 13:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 13:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1098.eqiad.wmnet with reason: Maintenance
  • 12:46 aborrero@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt2001-dev.codfw.wmnet with OS bullseye
  • 12:45 claime: cgoubert@deploy1002:/apple-search$ helmfile -e codfw -i destroy - T316296
  • 12:45 claime: cgoubert@deploy1002:/apple-search$ helmfile -e eqiad -i destroy - T316296
  • 12:43 claime: cgoubert@deploy1002:/apple-search$ helmfile -e staging -i destroy - T316296
  • 12:41 claime: Starting apple-search removal from wikikube - T316296
  • 12:37 claime: Removing apple-search from conftool - T316296
  • 12:30 claime: Removing apple-search from service::catalog - T316296
  • 12:26 claime: cgoubert@authdns1001:~$ sudo -i authdns-update
  • 12:26 claime: Clean up apple-search DNS - T316296
  • 12:22 claime: apple-search removed from backends - T316296
  • 12:21 aborrero@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt2001-dev.codfw.wmnet with reason: host reimage
  • 12:18 aborrero@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt2001-dev.codfw.wmnet with reason: host reimage
  • 12:17 oblivian@cumin1001: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on D{lvs2009.codfw.wmnet,lvs1019.eqiad.wmnet} and A:lvs
  • 12:17 claime: cgoubert@lvs1019:~$ sudo ipvsadm --delete-service --tcp-service 10.2.2.68:4013
  • 12:12 claime: cgoubert@lvs2009:~$ sudo ipvsadm --delete-service --tcp-service 10.2.1.68:4013
  • 12:10 oblivian@cumin1001: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on D{lvs2009.codfw.wmnet,lvs1019.eqiad.wmnet} and A:lvs
  • 12:09 oblivian@cumin1001: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on D{lvs2010.codfw.wmnet,lvs1020.eqiad.wmnet} and A:lvs
  • 12:08 claime: cgoubert@lvs1020:~$ sudo ipvsadm --delete-service --tcp-service 10.2.2.68:4013
  • 12:06 claime: cgoubert@lvs2010:~$ sudo ipvsadm --delete-service --tcp-service 10.2.1.68:4013
  • 12:02 aborrero@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt2001-dev.codfw.wmnet with OS bullseye
  • 12:01 moritzm: installing libgoogle-gson-java security updates
  • 12:01 oblivian@cumin1001: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on D{lvs2010.codfw.wmnet,lvs1020.eqiad.wmnet} and A:lvs
  • 11:53 claime: Switching apple-search to state:service_setup - T316296
  • 11:41 claime: Switching apple-search to state:lvs_setup - T316296
  • 11:34 claime: Running authdns-update - T316296
  • 11:31 moritzm: installing Linux 4.19.260 on Buster systems
  • 11:27 claime: Starting decommission of apple-search service - T316296
  • 10:34 moritzm: draining ganeti1012 in preparation of server move to a new rack T308339
  • 10:18 cgoubert@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
  • 10:18 cgoubert@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
  • 10:13 moritzm: installing sysstat security updates
  • 10:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5001.eqsin.wmnet
  • 10:04 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5001.eqsin.wmnet
  • 09:57 oblivian@deploy1002: Finished scap: Backport for Don't run OutputPageBeforeHTML for the talkpageheader (T316175) (duration: 05m 29s)
  • 09:52 oblivian@deploy1002: oblivian and matmarex: Backport for Don't run OutputPageBeforeHTML for the talkpageheader (T316175) synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet
  • 09:52 oblivian@deploy1002: Started scap: Backport for Don't run OutputPageBeforeHTML for the talkpageheader (T316175)
  • 09:51 filippo@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "sync-mgmt - filippo@cumin1001"
  • 09:49 filippo@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "sync-mgmt - filippo@cumin1001"
  • 09:37 moritzm: installing ncurses security updates
  • 09:21 godog: nuke MediaWiki.objectcache.*_11ed_* - T323357
  • 09:16 elukey: push the 'k8s_116' tag for docker-registry.discovery.wmnet/pause - T322920
  • 09:08 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1019.eqiad.wmnet to cluster eqiad and group D
  • 09:06 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1019.eqiad.wmnet to cluster eqiad and group D
  • 08:46 moritzm: failover ganeti master in eqsin to ganeti5003
  • 08:41 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 45102
  • 08:41 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 45102
  • 08:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5003.eqsin.wmnet
  • 08:37 XioNoX: shutdown SV8 port - T321323
  • 08:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1019.eqiad.wmnet
  • 08:31 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5003.eqsin.wmnet
  • 08:28 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1019.eqiad.wmnet
  • 07:24 XioNoX: decom all Equinix SV8 BGP sessions - T321323
  • 04:45 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-jumbo1010']
  • 04:28 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-jumbo1010']
  • 04:27 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-jumbo1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 04:01 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host kafka-jumbo1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 03:56 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 03:54 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 02:45 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-logging1005']
  • 02:36 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-logging1005']
  • 01:56 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-logging1005']
  • 01:46 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-logging1005']
  • 01:46 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['kafka-logging1005']
  • 01:39 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-logging1005']
  • 01:37 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['kafka-logging1005']
  • 01:37 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-logging1005']
  • 01:35 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-logging1005']
  • 01:34 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-logging1005']
  • 01:26 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['db2173']
  • 01:25 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2173']
  • 01:21 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['kafka-logging1005']
  • 01:20 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-logging1005']
  • 01:04 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-logging1005']
  • 01:04 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-logging1005']
  • 01:01 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['kafka-logging1005']
  • 00:51 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-logging1005']
  • 00:47 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['kafka-logging1005']
  • 00:40 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-logging1005']
  • 00:10 bking@cumin1001: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99)

2022-11-17

  • 23:05 bking@cumin1001: START - Cookbook sre.wdqs.data-transfer
  • 22:50 brennen@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.40.0-wmf.10 refs T320515
  • 22:48 bking@cumin1001: END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97)
  • 22:46 bking@cumin1001: START - Cookbook sre.wdqs.data-transfer
  • 22:41 bking@cumin1001: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99)
  • 22:41 brennen@deploy1002: Finished scap: Backport for MediaWiki: Temp silence FR-induced clearActionName warnings (T323254) (duration: 07m 16s)
  • 22:37 bking@cumin1001: START - Cookbook sre.wdqs.data-transfer
  • 22:34 brennen@deploy1002: brennen and brennen: Backport for MediaWiki: Temp silence FR-induced clearActionName warnings (T323254) synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet
  • 22:34 brennen@deploy1002: Started scap: Backport for MediaWiki: Temp silence FR-induced clearActionName warnings (T323254)
  • 21:58 krinkle@deploy1002: Finished scap: Backport for Enable logging for 'rdbms' channel (T320873) (duration: 08m 54s)
  • 21:49 krinkle@deploy1002: krinkle and krinkle: Backport for Enable logging for 'rdbms' channel (T320873) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet
  • 21:49 krinkle@deploy1002: Started scap: Backport for Enable logging for 'rdbms' channel (T320873)
  • 21:44 andrew@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 21:43 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['db2173']
  • 21:42 andrew@cumin1001: START - Cookbook sre.dns.netbox
  • 21:42 andrew@cumin1001: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 21:41 andrew@cumin1001: START - Cookbook sre.dns.netbox
  • 21:37 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2173']
  • 21:33 andrew@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 21:31 andrew@cumin1001: START - Cookbook sre.dns.netbox
  • 21:19 TheresNoTime: closing UTC late backport window
  • 21:08 samtar@deploy1002: Finished scap: Backport for Increase CirrusSearch-Search pool counter by 10% (duration: 05m 19s)
  • 21:03 samtar@deploy1002: samtar and ebernhardson: Backport for Increase CirrusSearch-Search pool counter by 10% synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet
  • 21:03 samtar@deploy1002: Started scap: Backport for Increase CirrusSearch-Search pool counter by 10%
  • 21:02 mutante: replacing phab2001 (decom'ed) with phab2002 in Phabricator SPF TXT record in DNS
  • 20:52 jbond@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts puppetdb2003.codfw.wmnet
  • 20:46 jbond@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts puppetdb2003.codfw.wmnet
  • 20:46 jbond@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts puppetdb2003.codfw.wmnet
  • 20:46 jbond@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts puppetdb2003.codfw.wmnet
  • 20:40 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2052.codfw.wmnet with OS bullseye
  • 20:15 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2052.codfw.wmnet with reason: host reimage
  • 20:11 ryankemper@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2052.codfw.wmnet with reason: host reimage
  • 19:54 ryankemper@cumin1001: START - Cookbook sre.hosts.reimage for host elastic2052.codfw.wmnet with OS bullseye
  • 19:16 brennen@deploy1002: Synchronized php: group1 wikis to 1.40.0-wmf.10 refs T320515 (duration: 03m 40s)
  • 19:15 volans: installed spicerack v5.0.2 on the cumin hosts
  • 19:13 volans: uploaded spicerack_5.0.2 to apt.wikimedia.org bullseye-wikimedia
  • 19:13 brennen@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.40.0-wmf.10 refs T320515
  • 19:06 brennen: train 1.40.0-wmf.10 (T320515) - no current blockers; rolling first to group1, 10 minutes or so to bake in, then will attempt all wikis.
  • 19:01 jbond@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts puppetdb2003.codfw.wmnet
  • 18:59 sukhe@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp2042.codfw.wmnet
  • 18:57 brennen@deploy1002: Finished scap: no-op deploy to attempt re-pull on parse1015.eqiad.wmnet (duration: 04m 21s)
  • 18:52 brennen@deploy1002: Started scap: no-op deploy to attempt re-pull on parse1015.eqiad.wmnet
  • 18:48 ebernhardson@deploy1002: Finished deploy [wdqs/wdqs@fb7d161]: 0.3.118 (duration: 11m 12s)
  • 18:44 volans: upgraded spicerack to v5.0.1 on the cumin hosts
  • 18:36 ebernhardson@deploy1002: Started deploy [wdqs/wdqs@fb7d161]: 0.3.118
  • 18:27 volans@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest1001.mgmt.eqiad.wmnet with reboot policy GRACEFUL
  • 18:26 volans@cumin2002: START - Cookbook sre.hosts.provision for host sretest1001.mgmt.eqiad.wmnet with reboot policy GRACEFUL
  • 18:17 brennen@deploy1002: Finished deploy [phabricator/deployment@f68dc24]: deploy mysql.port value to local config (duration: 00m 58s)
  • 18:16 brennen@deploy1002: Started deploy [phabricator/deployment@f68dc24]: deploy mysql.port value to local config
  • 18:14 jbond@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts puppetdb2003.codfw.wmnet
  • 18:05 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps2008.codfw.wmnet
  • 18:05 hnowlan@cumin1001: END (PASS) - Cookbook sre.postgresql.postgres-init (exit_code=0)
  • 17:59 brennen@deploy1002: Finished scap: Backport for InitializeArticleMaybeRedirect hook: Improve docs & restrict (T323254) (duration: 05m 55s)
  • 17:58 jbond@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts sretest1001.eqiad.wmnet
  • 17:54 brennen@deploy1002: brennen and krinkle: Backport for InitializeArticleMaybeRedirect hook: Improve docs & restrict (T323254) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet
  • 17:53 brennen@deploy1002: Started scap: Backport for InitializeArticleMaybeRedirect hook: Improve docs & restrict (T323254)
  • 17:46 jbond@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1001.eqiad.wmnet
  • 17:45 jbond@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts sretest1001.eqiad.wmnet
  • 17:45 jbond@cumin2002: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host sretest1001.eqiad.wmnet
  • 17:22 jbond@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest1001.eqiad.wmnet
  • 17:11 jbond@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1001.eqiad.wmnet
  • 17:10 jbond@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts sretest1001.eqiad.wmnet
  • 17:10 jbond@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1001.eqiad.wmnet
  • 16:55 volans: uploaded spicerack_5.0.1 to apt.wikimedia.org bullseye-wikimedia
  • 16:48 jnuche@deploy1002: Installing scap version "4.28.2" for 1 hosts
  • 16:46 jnuche@deploy1002: Finished scap: testing k8s deploys (duration: 15m 19s)
  • 16:43 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
  • 16:41 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
  • 16:40 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
  • 16:40 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
  • 16:40 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
  • 16:40 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
  • 16:37 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
  • 16:37 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-jobrunner: apply
  • 16:37 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-jobrunner: apply
  • 16:37 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
  • 16:37 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
  • 16:37 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
  • 16:37 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
  • 16:36 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
  • 16:36 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-web: apply
  • 16:36 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
  • 16:36 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
  • 16:36 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
  • 16:33 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
  • 16:33 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
  • 16:33 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
  • 16:33 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
  • 16:33 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
  • 16:33 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-jobrunner: apply
  • 16:33 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
  • 16:33 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-jobrunner: apply
  • 16:33 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
  • 16:32 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
  • 16:32 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
  • 16:32 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
  • 16:32 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-web: apply
  • 16:32 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
  • 16:32 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
  • 16:32 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 16:32 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
  • 16:31 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 16:31 jnuche@deploy1002: Started scap: testing k8s deploys
  • 16:23 jnuche@deploy1002: Installing scap version "4.28.2" for 559 hosts
  • 16:12 moritzm: active CAS instance has been switched to CAS 6.6.2 (from 6.4.6.3) T311235
  • 16:10 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@d33ab6c]: implement incoming_links update as a batch job (duration: 02m 26s)
  • 16:08 ladsgroup@deploy1002: Finished scap: Backport for Get rid of extract2.php (T273179) (duration: 05m 51s)
  • 16:08 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@d33ab6c]: implement incoming_links update as a batch job
  • 16:03 ladsgroup@deploy1002: ladsgroup and ladsgroup: Backport for Get rid of extract2.php (T273179) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet
  • 16:02 ladsgroup@deploy1002: Started scap: Backport for Get rid of extract2.php (T273179)
  • 16:01 mforns@deploy1002: Finished deploy [analytics/refinery@d7388a6] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@d7388a6] (duration: 01m 13s)
  • 16:00 mforns@deploy1002: Started deploy [analytics/refinery@d7388a6] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@d7388a6]
  • 16:00 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
  • 15:59 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
  • 15:59 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-jobrunner: apply
  • 15:59 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
  • 15:59 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
  • 15:59 mforns@deploy1002: Finished deploy [analytics/refinery@d7388a6] (thin): Regular analytics weekly train THIN [analytics/refinery@d7388a6] (duration: 00m 08s)
  • 15:59 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
  • 15:59 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
  • 15:59 mforns@deploy1002: Started deploy [analytics/refinery@d7388a6] (thin): Regular analytics weekly train THIN [analytics/refinery@d7388a6]
  • 15:57 mforns@deploy1002: Finished deploy [analytics/refinery@d7388a6]: Regular analytics weekly train [analytics/refinery@d7388a6] (duration: 05m 15s)
  • 15:56 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
  • 15:56 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
  • 15:55 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-jobrunner: apply
  • 15:55 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
  • 15:55 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
  • 15:55 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
  • 15:55 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
  • 15:55 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-web: apply
  • 15:52 mforns@deploy1002: Started deploy [analytics/refinery@d7388a6]: Regular analytics weekly train [analytics/refinery@d7388a6]
  • 15:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 15:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 15:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T323214)', diff saved to https://phabricator.wikimedia.org/P40117 and previous config saved to /var/cache/conftool/dbconfig/20221117-154855-ladsgroup.json
  • 15:45 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
  • 15:45 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
  • 15:45 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
  • 15:45 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
  • 15:45 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
  • 15:43 hnowlan@cumin1001: START - Cookbook sre.postgresql.postgres-init
  • 15:42 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
  • 15:42 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
  • 15:42 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-jobrunner: apply
  • 15:42 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
  • 15:42 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-jobrunner: apply
  • 15:42 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
  • 15:42 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
  • 15:42 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
  • 15:42 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
  • 15:42 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-web: apply
  • 15:41 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
  • 15:41 jnuche@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
  • 15:41 jnuche@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 15:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti1019.eqiad.wmnet with OS bullseye
  • 15:39 hnowlan@cumin1001: END (FAIL) - Cookbook sre.postgresql.postgres-init (exit_code=99)
  • 15:38 hnowlan@cumin1001: START - Cookbook sre.postgresql.postgres-init
  • 15:37 hnowlan@cumin1001: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host maps2008.codfw.wmnet
  • 15:37 hnowlan@cumin1001: START - Cookbook sre.hosts.reboot-single for host maps2008.codfw.wmnet
  • 15:37 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps2008.codfw.wmnet
  • 15:34 jnuche@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
  • 15:34 jnuche@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 15:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P40116 and previous config saved to /var/cache/conftool/dbconfig/20221117-153348-ladsgroup.json
  • 15:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti1019.eqiad.wmnet with reason: host reimage
  • 15:23 jnuche@deploy1002: Started scap: testing k8s deploys
  • 15:21 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti1019.eqiad.wmnet with reason: host reimage
  • 15:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P40115 and previous config saved to /var/cache/conftool/dbconfig/20221117-151842-ladsgroup.json
  • 15:07 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1019.eqiad.wmnet with OS bullseye
  • 15:04 ladsgroup@deploy1002: Finished scap: Backport for Move api/index.html to docroot (T273179) (duration: 07m 07s)
  • 15:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T323214)', diff saved to https://phabricator.wikimedia.org/P40114 and previous config saved to /var/cache/conftool/dbconfig/20221117-150335-ladsgroup.json
  • 15:02 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
  • 14:57 ladsgroup@deploy1002: ladsgroup and ladsgroup: Backport for Move api/index.html to docroot (T273179) synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet
  • 14:57 vgutierrez: vgutierrez@apt1001:~$ sudo -i reprepro --component thirdparty/haproxy24 update bullseye-wikimedia
  • 14:57 ladsgroup@deploy1002: Started scap: Backport for Move api/index.html to docroot (T273179)
  • 14:55 vgutierrez: vgutierrez@apt1001:~$ sudo -i reprepro clearvanished
  • 14:55 urbanecm@deploy1002: Finished scap: 4e419212: f659d88b: 65cd6881: 96e86cf: 5b94aca: 7a06c4b98: DiscussionTools, GlobalUsage, MinervaNeue backports (T316175, T323171, T257394, T323241) (duration: 04m 29s)
  • 14:51 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
  • 14:50 urbanecm@deploy1002: Started scap: 4e419212: f659d88b: 65cd6881: 96e86cf: 5b94aca: 7a06c4b98: DiscussionTools, GlobalUsage, MinervaNeue backports (T316175, T323171, T257394, T323241)
  • 14:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5002.eqsin.wmnet
  • 14:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1002.eqiad.wmnet
  • 14:41 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host mc-wf1002.eqiad.wmnet
  • 14:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5002.eqsin.wmnet
  • 14:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1001.eqiad.wmnet
  • 14:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2015.codfw.wmnet
  • 14:34 vgutierrez: depool cp2042
  • 14:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1197 (T323214)', diff saved to https://phabricator.wikimedia.org/P40113 and previous config saved to /var/cache/conftool/dbconfig/20221117-143334-ladsgroup.json
  • 14:33 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1197.eqiad.wmnet with reason: Maintenance
  • 14:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1197.eqiad.wmnet with reason: Maintenance
  • 14:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T323214)', diff saved to https://phabricator.wikimedia.org/P40112 and previous config saved to /var/cache/conftool/dbconfig/20221117-143313-ladsgroup.json
  • 14:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on ganeti1019.eqiad.wmnet with reason: Remove from cluster for eventual reimage
  • 14:30 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on ganeti1019.eqiad.wmnet with reason: Remove from cluster for eventual reimage
  • 14:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host mc-wf1001.eqiad.wmnet
  • 14:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2015.codfw.wmnet
  • 14:18 urbanecm@deploy1002: Sync cancelled.
  • 14:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P40111 and previous config saved to /var/cache/conftool/dbconfig/20221117-141806-ladsgroup.json
  • 14:14 urbanecm@deploy1002: urbanecm and matmarex: Backport for Make "Add topic" button sticky (T316175), CommentFormatter: Fix condition for lede button to consider new wrappers (T323171), Remove override for Minerva hiding .tmbox, no longer needed (T257394), CommentFormatter: Fix condition for lede button to consider table of contents (T323241), [[gerr
  • 14:13 urbanecm@deploy1002: Started scap: Backport for Make "Add topic" button sticky (T316175), CommentFormatter: Fix condition for lede button to consider new wrappers (T323171), Remove override for Minerva hiding .tmbox, no longer needed (T257394), CommentFormatter: Fix condition for lede button to consider table of contents (T323241), [[gerrit:858312
  • 14:12 urbanecm@deploy1002: Finished scap: Backport for fiwiktionary: Add rollbacker group (T323063) (duration: 06m 35s)
  • 14:06 urbanecm@deploy1002: urbanecm and stang: Backport for fiwiktionary: Add rollbacker group (T323063) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
  • 14:05 urbanecm@deploy1002: Started scap: Backport for fiwiktionary: Add rollbacker group (T323063)
  • 14:05 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2020.codfw.wmnet
  • 14:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P40110 and previous config saved to /var/cache/conftool/dbconfig/20221117-140300-ladsgroup.json
  • 14:00 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2020.codfw.wmnet
  • 13:58 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 6774
  • 13:56 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 6774
  • 13:52 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps2008.codfw.wmnet
  • 13:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T323214)', diff saved to https://phabricator.wikimedia.org/P40109 and previous config saved to /var/cache/conftool/dbconfig/20221117-134753-ladsgroup.json
  • 13:46 moritzm: failover ganeti master in codfw to ganeti2021
  • 13:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2030.codfw.wmnet
  • 13:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2030.codfw.wmnet
  • 13:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2029.codfw.wmnet
  • 13:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1188 (T323214)', diff saved to https://phabricator.wikimedia.org/P40108 and previous config saved to /var/cache/conftool/dbconfig/20221117-131709-ladsgroup.json
  • 13:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1188.eqiad.wmnet with reason: Maintenance
  • 13:16 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1188.eqiad.wmnet with reason: Maintenance
  • 13:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T323214)', diff saved to https://phabricator.wikimedia.org/P40107 and previous config saved to /var/cache/conftool/dbconfig/20221117-131647-ladsgroup.json
  • 13:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2029.codfw.wmnet
  • 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2028.codfw.wmnet
  • 13:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2028.codfw.wmnet
  • 13:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P40106 and previous config saved to /var/cache/conftool/dbconfig/20221117-130141-ladsgroup.json
  • 12:55 mfossati@deploy1002: Finished deploy [airflow-dags/platform_eng@4bdda20]: (no justification provided) (duration: 00m 18s)
  • 12:55 mfossati@deploy1002: Started deploy [airflow-dags/platform_eng@4bdda20]: (no justification provided)
  • 12:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P40105 and previous config saved to /var/cache/conftool/dbconfig/20221117-124634-ladsgroup.json
  • 12:32 mfossati@deploy1002: Finished deploy [airflow-dags/platform_eng@3bb99c2]: (no justification provided) (duration: 00m 05s)
  • 12:32 mfossati@deploy1002: Started deploy [airflow-dags/platform_eng@3bb99c2]: (no justification provided)
  • 12:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T323214)', diff saved to https://phabricator.wikimedia.org/P40104 and previous config saved to /var/cache/conftool/dbconfig/20221117-123128-ladsgroup.json
  • 12:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2027.codfw.wmnet
  • 12:29 moritzm: installing bluez security updates
  • 12:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T318955)', diff saved to https://phabricator.wikimedia.org/P40103 and previous config saved to /var/cache/conftool/dbconfig/20221117-122532-ladsgroup.json
  • 12:24 moritzm: restarting slapd on serpens/seaborgium/ldap-corp* to pick up GNUTLS update
  • 12:23 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2027.codfw.wmnet
  • 12:22 jmm@cumin2002: END (PASS) - Cookbook sre.maps.roll-restart (exit_code=0) rolling restart_daemons on A:maps-replica-eqiad
  • 12:18 jmm@cumin2002: START - Cookbook sre.maps.roll-restart rolling restart_daemons on A:maps-replica-eqiad
  • 12:13 sukhe: rolling restart of A:wikidough to pick up security updates
  • 12:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2026.codfw.wmnet
  • 12:12 jmm@cumin2002: END (PASS) - Cookbook sre.maps.roll-restart (exit_code=0) rolling restart_daemons on A:maps-replica-codfw
  • 12:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P40101 and previous config saved to /var/cache/conftool/dbconfig/20221117-121026-ladsgroup.json
  • 12:06 Emperor: restart swift proxies to deploy phonos changes to rewrite.py T317417
  • 12:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2026.codfw.wmnet
  • 12:02 urbanecm: [urbanecm@mwmaint1002 ~]$ time mwscript extensions/GrowthExperiments/maintenance/updateIsActiveFlagForMentees.php --wiki=trwiki # T318457
  • 12:01 hashar: Gerrit back since 11:45 UTC
  • 12:01 urbanecm: [urbanecm@mwmaint1002 ~]$ time mwscript extensions/GrowthExperiments/maintenance/updateIsActiveFlagForMentees.php --wiki=enwiki # T318457
  • 11:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P40100 and previous config saved to /var/cache/conftool/dbconfig/20221117-115520-ladsgroup.json
  • 11:50 jmm@cumin2002: START - Cookbook sre.maps.roll-restart rolling restart_daemons on A:maps-replica-codfw
  • 11:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2025.codfw.wmnet
  • 11:47 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp5032.eqsin.wmnet
  • 11:47 sukhe@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp5032.eqsin.wmnet
  • 11:41 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2025.codfw.wmnet
  • 11:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T318955)', diff saved to https://phabricator.wikimedia.org/P40099 and previous config saved to /var/cache/conftool/dbconfig/20221117-114013-ladsgroup.json
  • 11:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T323214)', diff saved to https://phabricator.wikimedia.org/P40098 and previous config saved to /var/cache/conftool/dbconfig/20221117-113814-ladsgroup.json
  • 11:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1166 (T318955)', diff saved to https://phabricator.wikimedia.org/P40097 and previous config saved to /var/cache/conftool/dbconfig/20221117-113621-ladsgroup.json
  • 11:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 11:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 11:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2024.codfw.wmnet
  • 11:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2024.codfw.wmnet
  • 11:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P40096 and previous config saved to /var/cache/conftool/dbconfig/20221117-112307-ladsgroup.json
  • 11:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2022.codfw.wmnet
  • 11:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1182 (T323214)', diff saved to https://phabricator.wikimedia.org/P40095 and previous config saved to /var/cache/conftool/dbconfig/20221117-111745-ladsgroup.json
  • 11:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1182.eqiad.wmnet with reason: Maintenance
  • 11:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1182.eqiad.wmnet with reason: Maintenance
  • 11:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T323214)', diff saved to https://phabricator.wikimedia.org/P40094 and previous config saved to /var/cache/conftool/dbconfig/20221117-111712-ladsgroup.json
  • 11:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2022.codfw.wmnet
  • 11:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P40093 and previous config saved to /var/cache/conftool/dbconfig/20221117-110801-ladsgroup.json
  • 11:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2021.codfw.wmnet
  • 11:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P40092 and previous config saved to /var/cache/conftool/dbconfig/20221117-110206-ladsgroup.json
  • 10:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2021.codfw.wmnet
  • 10:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T323214)', diff saved to https://phabricator.wikimedia.org/P40091 and previous config saved to /var/cache/conftool/dbconfig/20221117-105254-ladsgroup.json
  • 10:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P40090 and previous config saved to /var/cache/conftool/dbconfig/20221117-104659-ladsgroup.json
  • 10:45 moritzm: restarting apache/FPM on mw canaries to pick up gnutls security updates
  • 10:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T323214)', diff saved to https://phabricator.wikimedia.org/P40089 and previous config saved to /var/cache/conftool/dbconfig/20221117-103153-ladsgroup.json
  • 10:25 vgutierrez: pool ats-be@cp2042
  • 10:20 moritzm: installing gnutls28 security updates on Buster
  • 10:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2019.codfw.wmnet
  • 10:19 hashar: gerrit1001: removed 5G of 2019's thread dumps in `/srv/home-cobalt.wikimedia.org/thcipriani/threaddumps`
  • 10:12 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2019.codfw.wmnet
  • 10:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2018.codfw.wmnet
  • 09:56 hashar: Stopped Gerrit and running offline reindexing
  • 09:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2018.codfw.wmnet
  • 09:52 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2017.codfw.wmnet
  • 09:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2017.codfw.wmnet
  • 09:42 hashar: Cleaning gerrit1001.wikimedia.org `/` partition
  • 09:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2016.codfw.wmnet
  • 09:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2175 (T323214)', diff saved to https://phabricator.wikimedia.org/P40087 and previous config saved to /var/cache/conftool/dbconfig/20221117-093650-ladsgroup.json
  • 09:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2175.codfw.wmnet with reason: Maintenance
  • 09:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2175.codfw.wmnet with reason: Maintenance
  • 09:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312 (T323214)', diff saved to https://phabricator.wikimedia.org/P40086 and previous config saved to /var/cache/conftool/dbconfig/20221117-093628-ladsgroup.json
  • 09:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2016.codfw.wmnet
  • 09:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3312 (T323214)', diff saved to https://phabricator.wikimedia.org/P40085 and previous config saved to /var/cache/conftool/dbconfig/20221117-092902-ladsgroup.json
  • 09:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 09:28 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 09:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T323214)', diff saved to https://phabricator.wikimedia.org/P40084 and previous config saved to /var/cache/conftool/dbconfig/20221117-092841-ladsgroup.json
  • 09:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2014.codfw.wmnet
  • 09:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312', diff saved to https://phabricator.wikimedia.org/P40083 and previous config saved to /var/cache/conftool/dbconfig/20221117-092121-ladsgroup.json
  • 09:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2014.codfw.wmnet
  • 09:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P40082 and previous config saved to /var/cache/conftool/dbconfig/20221117-091334-ladsgroup.json
  • 09:12 hashar: Bringing back primary Gerrit on gerrit1001
  • 09:11 hashar@deploy1002: Finished deploy [gerrit/gerrit@39d9f06]: Gerrit to 3.5.4 on gerrit1001 (duration: 00m 08s)
  • 09:10 hashar@deploy1002: Started deploy [gerrit/gerrit@39d9f06]: Gerrit to 3.5.4 on gerrit1001
  • 09:09 hashar: Upgrading Gerrit primary instance
  • 09:07 hashar: Bringing back Gerrit on gerrit2002
  • 09:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312', diff saved to https://phabricator.wikimedia.org/P40081 and previous config saved to /var/cache/conftool/dbconfig/20221117-090615-ladsgroup.json
  • 09:04 hashar@deploy1002: Finished deploy [gerrit/gerrit@39d9f06]: Gerrit to 3.5.4 on gerrit2002 (duration: 00m 10s)
  • 09:04 hashar@deploy1002: Started deploy [gerrit/gerrit@39d9f06]: Gerrit to 3.5.4 on gerrit2002
  • 09:02 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagetcd2002.codfw.wmnet to plain
  • 09:02 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagetcd2002.codfw.wmnet to plain
  • 08:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P40080 and previous config saved to /var/cache/conftool/dbconfig/20221117-085828-ladsgroup.json
  • 08:55 krinkle@deploy1002: Finished deploy [integration/docroot@de83506]: (no justification provided) (duration: 00m 39s)
  • 08:55 krinkle@deploy1002: Started deploy [integration/docroot@de83506]: (no justification provided)
  • 08:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312 (T323214)', diff saved to https://phabricator.wikimedia.org/P40079 and previous config saved to /var/cache/conftool/dbconfig/20221117-085108-ladsgroup.json
  • 08:50 moritzm: draining ganeti1019 for eventual reimage T311687
  • 08:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T323214)', diff saved to https://phabricator.wikimedia.org/P40078 and previous config saved to /var/cache/conftool/dbconfig/20221117-084321-ladsgroup.json
  • 08:31 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagetcd2002.codfw.wmnet to drbd
  • 08:21 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagetcd2002.codfw.wmnet to drbd
  • 08:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1162 (T323214)', diff saved to https://phabricator.wikimedia.org/P40076 and previous config saved to /var/cache/conftool/dbconfig/20221117-081413-ladsgroup.json
  • 08:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1162.eqiad.wmnet with reason: Maintenance
  • 08:13 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1162.eqiad.wmnet with reason: Maintenance
  • 08:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T323214)', diff saved to https://phabricator.wikimedia.org/P40075 and previous config saved to /var/cache/conftool/dbconfig/20221117-081352-ladsgroup.json
  • 07:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P40074 and previous config saved to /var/cache/conftool/dbconfig/20221117-075845-ladsgroup.json
  • 07:47 elukey: restart kube-apiserver on ml-serve-ctrl2002 - high LIST latencies for knative, attempt to clear them out
  • 07:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2170:3312 (T323214)', diff saved to https://phabricator.wikimedia.org/P40073 and previous config saved to /var/cache/conftool/dbconfig/20221117-074732-ladsgroup.json
  • 07:47 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2170.codfw.wmnet with reason: Maintenance
  • 07:47 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2170.codfw.wmnet with reason: Maintenance
  • 07:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T323214)', diff saved to https://phabricator.wikimedia.org/P40071 and previous config saved to /var/cache/conftool/dbconfig/20221117-074721-ladsgroup.json
  • 07:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P40070 and previous config saved to /var/cache/conftool/dbconfig/20221117-074339-ladsgroup.json
  • 07:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P40069 and previous config saved to /var/cache/conftool/dbconfig/20221117-073215-ladsgroup.json
  • 07:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T323214)', diff saved to https://phabricator.wikimedia.org/P40068 and previous config saved to /var/cache/conftool/dbconfig/20221117-072832-ladsgroup.json
  • 07:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P40067 and previous config saved to /var/cache/conftool/dbconfig/20221117-071708-ladsgroup.json
  • 07:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T323214)', diff saved to https://phabricator.wikimedia.org/P40066 and previous config saved to /var/cache/conftool/dbconfig/20221117-070202-ladsgroup.json
  • 06:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1156 (T323214)', diff saved to https://phabricator.wikimedia.org/P40065 and previous config saved to /var/cache/conftool/dbconfig/20221117-062643-ladsgroup.json
  • 06:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 06:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 06:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1156.eqiad.wmnet with reason: Maintenance
  • 06:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1156.eqiad.wmnet with reason: Maintenance
  • 06:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T323214)', diff saved to https://phabricator.wikimedia.org/P40064 and previous config saved to /var/cache/conftool/dbconfig/20221117-062604-ladsgroup.json
  • 06:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P40063 and previous config saved to /var/cache/conftool/dbconfig/20221117-061058-ladsgroup.json
  • 05:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2148 (T323214)', diff saved to https://phabricator.wikimedia.org/P40062 and previous config saved to /var/cache/conftool/dbconfig/20221117-055938-ladsgroup.json
  • 05:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2148.codfw.wmnet with reason: Maintenance
  • 05:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2148.codfw.wmnet with reason: Maintenance
  • 05:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312 (T323214)', diff saved to https://phabricator.wikimedia.org/P40061 and previous config saved to /var/cache/conftool/dbconfig/20221117-055916-ladsgroup.json
  • 05:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P40060 and previous config saved to /var/cache/conftool/dbconfig/20221117-055551-ladsgroup.json
  • 05:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312', diff saved to https://phabricator.wikimedia.org/P40059 and previous config saved to /var/cache/conftool/dbconfig/20221117-054409-ladsgroup.json
  • 05:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T323214)', diff saved to https://phabricator.wikimedia.org/P40058 and previous config saved to /var/cache/conftool/dbconfig/20221117-054045-ladsgroup.json
  • 05:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312', diff saved to https://phabricator.wikimedia.org/P40057 and previous config saved to /var/cache/conftool/dbconfig/20221117-052903-ladsgroup.json
  • 05:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312 (T323214)', diff saved to https://phabricator.wikimedia.org/P40056 and previous config saved to /var/cache/conftool/dbconfig/20221117-051357-ladsgroup.json
  • 04:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3312 (