Server Admin Log

From Wikitech

2024-02-23

  • 10:59 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1190 (T357189)', diff saved to https://phabricator.wikimedia.org/P57816 and previous config saved to /var/cache/conftool/dbconfig/20240223-105929-arnaudb.json
  • 10:59 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1190.eqiad.wmnet with reason: Maintenance
  • 10:59 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1190.eqiad.wmnet with reason: Maintenance
  • 10:59 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1160 (T357189)', diff saved to https://phabricator.wikimedia.org/P57815 and previous config saved to /var/cache/conftool/dbconfig/20240223-105907-arnaudb.json
  • 10:52 hnowlan: running homer 'cr*codfw*' commit 'T351074' for new appservers being migrated to k8s workers
  • 10:49 hnowlan@cumin1002: conftool action : set/weight=10:pooled=yes; selector: name=(mw1458.eqiad.wmnet|mw1467.eqiad.wmnet|mw1468.eqiad.wmnet|mw1483.eqiad.wmnet|mw1484.eqiad.wmnet|mw1485.eqiad.wmnet|mw1494.eqiad.wmnet),cluster=kubernetes,service=kubesvc
  • 10:44 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P57814 and previous config saved to /var/cache/conftool/dbconfig/20240223-104401-arnaudb.json
  • 10:41 hnowlan: running homer 'cr*eqiad*' commit 'T351074' && homer 'lsw1-f2-eqiad*' commit 'T351074' for jobrunners being migrated to k8s workers
  • 10:28 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P57813 and previous config saved to /var/cache/conftool/dbconfig/20240223-102854-arnaudb.json
  • 10:26 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset-next: apply
  • 10:26 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 10:13 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1160 (T357189)', diff saved to https://phabricator.wikimedia.org/P57811 and previous config saved to /var/cache/conftool/dbconfig/20240223-101348-arnaudb.json
  • 09:35 marostegui@cumin1002: dbctl commit (dc=all): 'es1031 (re)pooling @ 100%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57810 and previous config saved to /var/cache/conftool/dbconfig/20240223-093559-root.json
  • 09:20 marostegui@cumin1002: dbctl commit (dc=all): 'es1031 (re)pooling @ 75%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57809 and previous config saved to /var/cache/conftool/dbconfig/20240223-092053-root.json
  • 09:09 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1160 (T357189)', diff saved to https://phabricator.wikimedia.org/P57808 and previous config saved to /var/cache/conftool/dbconfig/20240223-090913-arnaudb.json
  • 09:09 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1160.eqiad.wmnet with reason: Maintenance
  • 09:08 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1160.eqiad.wmnet with reason: Maintenance
  • 09:05 marostegui@cumin1002: dbctl commit (dc=all): 'es1031 (re)pooling @ 50%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57807 and previous config saved to /var/cache/conftool/dbconfig/20240223-090549-root.json
  • 08:54 root@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging GoranSMilovanovic out of all services on: 8 hosts
  • 08:53 root@cumin2002: START - Cookbook sre.idm.logout Logging GoranSMilovanovic out of all services on: 8 hosts
  • 08:50 marostegui@cumin1002: dbctl commit (dc=all): 'es1031 (re)pooling @ 25%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57806 and previous config saved to /var/cache/conftool/dbconfig/20240223-085043-root.json
  • 08:35 marostegui@cumin1002: dbctl commit (dc=all): 'es1031 (re)pooling @ 10%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57805 and previous config saved to /var/cache/conftool/dbconfig/20240223-083538-root.json
  • 08:20 marostegui@cumin1002: dbctl commit (dc=all): 'es1031 (re)pooling @ 5%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57804 and previous config saved to /var/cache/conftool/dbconfig/20240223-082033-root.json
  • 08:20 godog: rollout prometheus-rsyslog-exporter new version to remaining hosts, caching sites - T357616
  • 08:19 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 08:18 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 08:05 marostegui@cumin1002: dbctl commit (dc=all): 'es1031 (re)pooling @ 1%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57803 and previous config saved to /var/cache/conftool/dbconfig/20240223-080528-root.json
  • 08:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1031.eqiad.wmnet with OS bookworm
  • 07:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1031.eqiad.wmnet with reason: host reimage
  • 07:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es1031.eqiad.wmnet with reason: host reimage
  • 07:40 marostegui: Install 10.6.17 on pc1014 T357089
  • 07:28 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es1031.eqiad.wmnet with OS bookworm
  • 07:19 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es1031 T358180', diff saved to https://phabricator.wikimedia.org/P57802 and previous config saved to /var/cache/conftool/dbconfig/20240223-071952-root.json
  • 01:59 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195 (T357189)', diff saved to https://phabricator.wikimedia.org/P57801 and previous config saved to /var/cache/conftool/dbconfig/20240223-015907-arnaudb.json
  • 01:44 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195', diff saved to https://phabricator.wikimedia.org/P57800 and previous config saved to /var/cache/conftool/dbconfig/20240223-014400-arnaudb.json
  • 01:28 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195', diff saved to https://phabricator.wikimedia.org/P57799 and previous config saved to /var/cache/conftool/dbconfig/20240223-012853-arnaudb.json
  • 01:13 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195 (T357189)', diff saved to https://phabricator.wikimedia.org/P57798 and previous config saved to /var/cache/conftool/dbconfig/20240223-011347-arnaudb.json
  • 01:11 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2195 (T357189)', diff saved to https://phabricator.wikimedia.org/P57797 and previous config saved to /var/cache/conftool/dbconfig/20240223-011128-arnaudb.json
  • 01:11 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2195.codfw.wmnet with reason: Maintenance
  • 01:11 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2195.codfw.wmnet with reason: Maintenance
  • 01:11 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T357189)', diff saved to https://phabricator.wikimedia.org/P57796 and previous config saved to /var/cache/conftool/dbconfig/20240223-011107-arnaudb.json
  • 00:56 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P57795 and previous config saved to /var/cache/conftool/dbconfig/20240223-005601-arnaudb.json
  • 00:40 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P57794 and previous config saved to /var/cache/conftool/dbconfig/20240223-004054-arnaudb.json
  • 00:25 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T357189)', diff saved to https://phabricator.wikimedia.org/P57793 and previous config saved to /var/cache/conftool/dbconfig/20240223-002547-arnaudb.json
  • 00:14 zabe@deploy2002: Finished scap: Backport for block: Pass wikiId to DatabaseBlock::getId in DatabaseBlockStore (T358208) (duration: 11m 02s)
  • 00:12 zabe: zabe@mwmaint2002:/tmp/uploads$ mwscript importImages.php --wiki=commonswiki --comment-ext=txt --user="Grandmaster Huon" . # T358022
  • 00:09 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2181 (T357189)', diff saved to https://phabricator.wikimedia.org/P57791 and previous config saved to /var/cache/conftool/dbconfig/20240223-000920-arnaudb.json
  • 00:09 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2181.codfw.wmnet with reason: Maintenance
  • 00:09 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2181.codfw.wmnet with reason: Maintenance
  • 00:09 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167 (T357189)', diff saved to https://phabricator.wikimedia.org/P57790 and previous config saved to /var/cache/conftool/dbconfig/20240223-000858-arnaudb.json
  • 00:06 zabe@deploy2002: zabe: Continuing with sync
  • 00:04 zabe@deploy2002: zabe: Backport for block: Pass wikiId to DatabaseBlock::getId in DatabaseBlockStore (T358208) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 00:03 zabe@deploy2002: Started scap: Backport for block: Pass wikiId to DatabaseBlock::getId in DatabaseBlockStore (T358208)

2024-02-22

  • 23:59 tstarling@deploy2002: Finished scap: (no justification provided) (duration: 09m 40s)
  • 23:53 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167', diff saved to https://phabricator.wikimedia.org/P57789 and previous config saved to /var/cache/conftool/dbconfig/20240222-235351-arnaudb.json
  • 23:49 tstarling@deploy2002: Started scap: (no justification provided)
  • 23:38 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167', diff saved to https://phabricator.wikimedia.org/P57788 and previous config saved to /var/cache/conftool/dbconfig/20240222-233845-arnaudb.json
  • 23:35 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase1035.eqiad.wmnet with reason: Bootstrapping — T354560
  • 23:35 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase1035.eqiad.wmnet with reason: Bootstrapping — T354560
  • 23:23 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167 (T357189)', diff saved to https://phabricator.wikimedia.org/P57787 and previous config saved to /var/cache/conftool/dbconfig/20240222-232338-arnaudb.json
  • 23:21 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2167 (T357189)', diff saved to https://phabricator.wikimedia.org/P57786 and previous config saved to /var/cache/conftool/dbconfig/20240222-232118-arnaudb.json
  • 23:21 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2167.codfw.wmnet with reason: Maintenance
  • 23:21 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2167.codfw.wmnet with reason: Maintenance
  • 23:20 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T357189)', diff saved to https://phabricator.wikimedia.org/P57785 and previous config saved to /var/cache/conftool/dbconfig/20240222-232056-arnaudb.json
  • 23:05 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P57784 and previous config saved to /var/cache/conftool/dbconfig/20240222-230549-arnaudb.json
  • 22:50 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P57783 and previous config saved to /var/cache/conftool/dbconfig/20240222-225042-arnaudb.json
  • 22:41 cjming: end of UTC late backport window
  • 22:40 cjming@deploy2002: Finished scap: Backport for Improve chunked upload jobs and abort assemble job if already in progress (T200820) (duration: 09m 46s)
  • 22:35 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T357189)', diff saved to https://phabricator.wikimedia.org/P57782 and previous config saved to /var/cache/conftool/dbconfig/20240222-223536-arnaudb.json
  • 22:33 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2166 (T357189)', diff saved to https://phabricator.wikimedia.org/P57781 and previous config saved to /var/cache/conftool/dbconfig/20240222-223314-arnaudb.json
  • 22:33 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2166.codfw.wmnet with reason: Maintenance
  • 22:32 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2166.codfw.wmnet with reason: Maintenance
  • 22:32 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T357189)', diff saved to https://phabricator.wikimedia.org/P57780 and previous config saved to /var/cache/conftool/dbconfig/20240222-223251-arnaudb.json
  • 22:32 cjming@deploy2002: bawolff and cjming: Continuing with sync
  • 22:32 cjming@deploy2002: bawolff and cjming: Backport for Improve chunked upload jobs and abort assemble job if already in progress (T200820) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 22:30 cjming@deploy2002: Started scap: Backport for Improve chunked upload jobs and abort assemble job if already in progress (T200820)
  • 22:30 cjming@deploy2002: Finished scap: Backport for testwiki: Allow modifying email in account vanishing contact form. (T343536) (duration: 09m 58s)
  • 22:22 cjming@deploy2002: cjming and dbrant: Continuing with sync
  • 22:21 cjming@deploy2002: cjming and dbrant: Backport for testwiki: Allow modifying email in account vanishing contact form. (T343536) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 22:20 cjming@deploy2002: Started scap: Backport for testwiki: Allow modifying email in account vanishing contact form. (T343536)
  • 22:18 cjming@deploy2002: Finished scap: Backport for Add verbiage for Account Vanishing contact page. (T343536) (duration: 27m 47s)
  • 22:17 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P57779 and previous config saved to /var/cache/conftool/dbconfig/20240222-221745-arnaudb.json
  • 22:06 cjming@deploy2002: dbrant and cjming: Continuing with sync
  • 22:05 cjming@deploy2002: dbrant and cjming: Backport for Add verbiage for Account Vanishing contact page. (T343536) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 22:02 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P57778 and previous config saved to /var/cache/conftool/dbconfig/20240222-220238-arnaudb.json
  • 21:51 cjming@deploy2002: Started scap: Backport for Add verbiage for Account Vanishing contact page. (T343536)
  • 21:50 cjming@deploy2002: Finished scap: Backport for Change font-size "Small" label to "Standard" (T358074) (duration: 29m 07s)
  • 21:47 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T357189)', diff saved to https://phabricator.wikimedia.org/P57777 and previous config saved to /var/cache/conftool/dbconfig/20240222-214732-arnaudb.json
  • 21:43 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2164 (T357189)', diff saved to https://phabricator.wikimedia.org/P57776 and previous config saved to /var/cache/conftool/dbconfig/20240222-214310-arnaudb.json
  • 21:43 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 21:42 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 21:42 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2164.codfw.wmnet with reason: Maintenance
  • 21:42 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2164.codfw.wmnet with reason: Maintenance
  • 21:42 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T357189)', diff saved to https://phabricator.wikimedia.org/P57775 and previous config saved to /var/cache/conftool/dbconfig/20240222-214221-arnaudb.json
  • 21:39 cjming@deploy2002: cjming and jdlrobson: Continuing with sync
  • 21:35 cjming@deploy2002: cjming and jdlrobson: Backport for Change font-size "Small" label to "Standard" (T358074) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:27 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P57774 and previous config saved to /var/cache/conftool/dbconfig/20240222-212715-arnaudb.json
  • 21:21 cjming@deploy2002: Started scap: Backport for Change font-size "Small" label to "Standard" (T358074)
  • 21:12 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host testvm2002.codfw.wmnet with OS bullseye
  • 21:12 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P57773 and previous config saved to /var/cache/conftool/dbconfig/20240222-211208-arnaudb.json
  • 21:01 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on testvm2002.codfw.wmnet with reason: host reimage
  • 20:57 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on testvm2002.codfw.wmnet with reason: host reimage
  • 20:57 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T357189)', diff saved to https://phabricator.wikimedia.org/P57772 and previous config saved to /var/cache/conftool/dbconfig/20240222-205701-arnaudb.json
  • 20:54 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2163 (T357189)', diff saved to https://phabricator.wikimedia.org/P57771 and previous config saved to /var/cache/conftool/dbconfig/20240222-205440-arnaudb.json
  • 20:54 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2163.codfw.wmnet with reason: Maintenance
  • 20:54 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2163.codfw.wmnet with reason: Maintenance
  • 20:54 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2162 (T357189)', diff saved to https://phabricator.wikimedia.org/P57770 and previous config saved to /var/cache/conftool/dbconfig/20240222-205417-arnaudb.json
  • 20:45 cmooney@cumin1002: START - Cookbook sre.hosts.reimage for host testvm2002.codfw.wmnet with OS bullseye
  • 20:39 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2162', diff saved to https://phabricator.wikimedia.org/P57769 and previous config saved to /var/cache/conftool/dbconfig/20240222-203911-arnaudb.json
  • 20:24 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2162', diff saved to https://phabricator.wikimedia.org/P57768 and previous config saved to /var/cache/conftool/dbconfig/20240222-202404-arnaudb.json
  • 20:09 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2162 (T357189)', diff saved to https://phabricator.wikimedia.org/P57767 and previous config saved to /var/cache/conftool/dbconfig/20240222-200858-arnaudb.json
  • 20:06 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2162 (T357189)', diff saved to https://phabricator.wikimedia.org/P57766 and previous config saved to /var/cache/conftool/dbconfig/20240222-200636-arnaudb.json
  • 20:06 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2162.codfw.wmnet with reason: Maintenance
  • 20:06 cmooney@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host testvm2002.codfw.wmnet with OS bullseye
  • 20:06 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2162.codfw.wmnet with reason: Maintenance
  • 20:06 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2161 (T357189)', diff saved to https://phabricator.wikimedia.org/P57765 and previous config saved to /var/cache/conftool/dbconfig/20240222-200614-arnaudb.json
  • 20:00 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 19:58 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 19:58 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 19:57 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 19:56 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 19:56 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 19:55 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on testvm2002.codfw.wmnet with reason: host reimage
  • 19:53 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 19:52 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on testvm2002.codfw.wmnet with reason: host reimage
  • 19:52 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 19:52 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 19:51 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P57764 and previous config saved to /var/cache/conftool/dbconfig/20240222-195108-arnaudb.json
  • 19:50 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 19:50 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 19:49 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 19:40 cmooney@cumin1002: START - Cookbook sre.hosts.reimage for host testvm2002.codfw.wmnet with OS bullseye
  • 19:36 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P57763 and previous config saved to /var/cache/conftool/dbconfig/20240222-193601-arnaudb.json
  • 19:30 robh@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:30 robh@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cleanup incorrect asset tags - robh@cumin2002"
  • 19:29 robh@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cleanup incorrect asset tags - robh@cumin2002"
  • 19:27 robh@cumin2002: START - Cookbook sre.dns.netbox
  • 19:23 cdanis@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 19:22 cdanis@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 19:20 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2161 (T357189)', diff saved to https://phabricator.wikimedia.org/P57762 and previous config saved to /var/cache/conftool/dbconfig/20240222-192055-arnaudb.json
  • 19:18 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2161 (T357189)', diff saved to https://phabricator.wikimedia.org/P57761 and previous config saved to /var/cache/conftool/dbconfig/20240222-191834-arnaudb.json
  • 19:18 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: group2 wikis to 1.42.0-wmf.19 refs T354437
  • 19:18 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2161.codfw.wmnet with reason: Maintenance
  • 19:18 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2161.codfw.wmnet with reason: Maintenance
  • 19:18 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T357189)', diff saved to https://phabricator.wikimedia.org/P57760 and previous config saved to /var/cache/conftool/dbconfig/20240222-191810-arnaudb.json
  • 19:14 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2385.codfw.wmnet with OS bullseye
  • 19:03 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P57759 and previous config saved to /var/cache/conftool/dbconfig/20240222-190304-arnaudb.json
  • 18:49 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2385.codfw.wmnet with reason: host reimage
  • 18:47 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P57758 and previous config saved to /var/cache/conftool/dbconfig/20240222-184757-arnaudb.json
  • 18:46 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2385.codfw.wmnet with reason: host reimage
  • 18:44 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2384.codfw.wmnet with OS bullseye
  • 18:32 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T357189)', diff saved to https://phabricator.wikimedia.org/P57757 and previous config saved to /var/cache/conftool/dbconfig/20240222-183251-arnaudb.json
  • 18:31 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw2385.codfw.wmnet with OS bullseye
  • 18:30 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2154 (T357189)', diff saved to https://phabricator.wikimedia.org/P57756 and previous config saved to /var/cache/conftool/dbconfig/20240222-183030-arnaudb.json
  • 18:30 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2154.codfw.wmnet with reason: Maintenance
  • 18:30 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2154.codfw.wmnet with reason: Maintenance
  • 18:30 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2152 (T357189)', diff saved to https://phabricator.wikimedia.org/P57755 and previous config saved to /var/cache/conftool/dbconfig/20240222-183009-arnaudb.json
  • 18:28 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1485.eqiad.wmnet with OS bullseye
  • 18:25 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1467.eqiad.wmnet with OS bullseye
  • 18:24 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1494.eqiad.wmnet with OS bullseye
  • 18:22 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Unbanning all hosts in search_codfw
  • 18:22 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Unbanning all hosts in search_codfw
  • 18:22 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1484.eqiad.wmnet with OS bullseye
  • 18:21 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2384.codfw.wmnet with reason: host reimage
  • 18:18 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2384.codfw.wmnet with reason: host reimage
  • 18:17 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1468.eqiad.wmnet with OS bullseye
  • 18:15 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P57753 and previous config saved to /var/cache/conftool/dbconfig/20240222-181502-arnaudb.json
  • 18:14 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1483.eqiad.wmnet with OS bullseye
  • 18:12 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1458.eqiad.wmnet with OS bullseye
  • 18:11 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1485.eqiad.wmnet with reason: host reimage
  • 18:07 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1467.eqiad.wmnet with reason: host reimage
  • 18:04 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1484.eqiad.wmnet with reason: host reimage
  • 18:04 bd808@deploy2002: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
  • 18:04 bd808@deploy2002: helmfile [codfw] START helmfile.d/services/developer-portal: apply
  • 18:04 bd808@deploy2002: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
  • 18:03 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw2384.codfw.wmnet with OS bullseye
  • 18:03 bd808@deploy2002: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
  • 18:03 hnowlan@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host mw2384.codfw.wmnet with OS bullseye
  • 18:03 bd808@deploy2002: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
  • 18:02 bd808@deploy2002: helmfile [staging] START helmfile.d/services/developer-portal: apply
  • 18:01 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1494.eqiad.wmnet with reason: host reimage
  • 17:59 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P57752 and previous config saved to /var/cache/conftool/dbconfig/20240222-175956-arnaudb.json
  • 17:59 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1468.eqiad.wmnet with reason: host reimage
  • 17:57 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1483.eqiad.wmnet with reason: host reimage
  • 17:54 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1494.eqiad.wmnet with reason: host reimage
  • 17:54 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1483.eqiad.wmnet with reason: host reimage
  • 17:54 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1458.eqiad.wmnet with reason: host reimage
  • 17:54 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1484.eqiad.wmnet with reason: host reimage
  • 17:54 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1485.eqiad.wmnet with reason: host reimage
  • 17:54 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1468.eqiad.wmnet with reason: host reimage
  • 17:52 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1467.eqiad.wmnet with reason: host reimage
  • 17:52 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1458.eqiad.wmnet with reason: host reimage
  • 17:51 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw2384.codfw.wmnet with OS bullseye
  • 17:45 cdanis@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 17:44 cdanis@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 17:44 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2152 (T357189)', diff saved to https://phabricator.wikimedia.org/P57751 and previous config saved to /var/cache/conftool/dbconfig/20240222-174449-arnaudb.json
  • 17:44 cdanis@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 17:43 cdanis@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 17:43 cdanis@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 17:43 cdanis@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 17:43 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2152 (T357189)', diff saved to https://phabricator.wikimedia.org/P57750 and previous config saved to /var/cache/conftool/dbconfig/20240222-174328-arnaudb.json
  • 17:43 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2152.codfw.wmnet with reason: Maintenance
  • 17:43 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset-next: apply
  • 17:43 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2152.codfw.wmnet with reason: Maintenance
  • 17:42 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2100.codfw.wmnet with reason: Maintenance
  • 17:42 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2100.codfw.wmnet with reason: Maintenance
  • 17:42 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 17:42 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 17:42 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
  • 17:42 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1494.eqiad.wmnet with OS bullseye
  • 17:41 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
  • 17:41 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1226 (T357189)', diff saved to https://phabricator.wikimedia.org/P57749 and previous config saved to /var/cache/conftool/dbconfig/20240222-174138-arnaudb.json
  • 17:41 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1485.eqiad.wmnet with OS bullseye
  • 17:41 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1484.eqiad.wmnet with OS bullseye
  • 17:41 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1483.eqiad.wmnet with OS bullseye
  • 17:41 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1468.eqiad.wmnet with OS bullseye
  • 17:40 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 17:39 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1467.eqiad.wmnet with OS bullseye
  • 17:39 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1458.eqiad.wmnet with OS bullseye
  • 17:39 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 17:36 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 17:35 cmooney@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host testvm2002.codfw.wmnet with OS bullseye
  • 17:26 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P57748 and previous config saved to /var/cache/conftool/dbconfig/20240222-172632-arnaudb.json
  • 17:11 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P57747 and previous config saved to /var/cache/conftool/dbconfig/20240222-171125-arnaudb.json
  • 17:05 topranks: disabling IPv6 RAs for private1-a-codfw vlan on codfw core routers T355544
  • 16:58 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Remove legacy codfw vc switches from synced hiera data after netbox status change - cmooney@cumin1002 - T355544"
  • 16:57 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Remove legacy codfw vc switches from synced hiera data after netbox status change - cmooney@cumin1002 - T355544"
  • 16:56 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1226 (T357189)', diff saved to https://phabricator.wikimedia.org/P57746 and previous config saved to /var/cache/conftool/dbconfig/20240222-165619-arnaudb.json
  • 16:56 topranks: disabling link from asw-a-codfw vc to ssw1-a1-codfw and ssw1-a8-codfw T355544
  • 16:54 dancy@deploy2002: Finished scap: testing T357402 again (duration: 08m 58s)
  • 16:54 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1226 (T357189)', diff saved to https://phabricator.wikimedia.org/P57745 and previous config saved to /var/cache/conftool/dbconfig/20240222-165401-arnaudb.json
  • 16:53 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1226.eqiad.wmnet with reason: Maintenance
  • 16:53 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1226.eqiad.wmnet with reason: Maintenance
  • 16:53 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1216.eqiad.wmnet with reason: Maintenance
  • 16:53 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1216.eqiad.wmnet with reason: Maintenance
  • 16:53 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214 (T357189)', diff saved to https://phabricator.wikimedia.org/P57744 and previous config saved to /var/cache/conftool/dbconfig/20240222-165312-arnaudb.json
  • 16:45 dancy@deploy2002: Started scap: testing T357402 again
  • 16:43 dancy@deploy2002: sync-world aborted: testing T357402 (duration: 14m 57s)
  • 16:42 akosiaris@cumin1002: conftool action : set/pooled=inactive; selector: service=parsoid-php,name=kubernetes.*
  • 16:38 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P57743 and previous config saved to /var/cache/conftool/dbconfig/20240222-163806-arnaudb.json
  • 16:36 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:36 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:30 fabfur@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp2032.codfw.wmnet,service=(cdn|ats-be)
  • 16:30 fabfur@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp2031.codfw.wmnet,service=(cdn|ats-be)
  • 16:28 fabfur@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cp[2031-2032].codfw.wmnet
  • 16:28 fabfur@cumin2002: START - Cookbook sre.hosts.remove-downtime for cp[2031-2032].codfw.wmnet
  • 16:28 dancy@deploy2002: Started scap: testing T357402
  • 16:26 dancy@deploy2002: Installation of scap version "4.66.0" completed for 458 hosts
  • 16:25 dancy@deploy2002: Installing scap version "4.66.0" for 458 hosts
  • 16:23 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P57742 and previous config saved to /var/cache/conftool/dbconfig/20240222-162300-arnaudb.json
  • 16:22 volans@cumin1002: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
  • 16:21 marostegui@cumin1002: dbctl commit (dc=all): 'db2149 (re)pooling @ 100%: After recloning', diff saved to https://phabricator.wikimedia.org/P57741 and previous config saved to /var/cache/conftool/dbconfig/20240222-162151-root.json
  • 16:19 cmooney@cumin1002: START - Cookbook sre.hosts.reimage for host testvm2002.codfw.wmnet with OS bullseye
  • 16:16 volans@cumin1002: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
  • 16:11 mvernon@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=swift,name=codfw
  • 16:11 Emperor: repool codfs-mw T355868
  • 16:10 Emperor: repool thanos-fe2002 T355868
  • 16:07 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214 (T357189)', diff saved to https://phabricator.wikimedia.org/P57740 and previous config saved to /var/cache/conftool/dbconfig/20240222-160753-arnaudb.json
  • 16:06 marostegui@cumin1002: dbctl commit (dc=all): 'db2149 (re)pooling @ 75%: After recloning', diff saved to https://phabricator.wikimedia.org/P57739 and previous config saved to /var/cache/conftool/dbconfig/20240222-160646-root.json
  • 16:05 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1214 (T357189)', diff saved to https://phabricator.wikimedia.org/P57738 and previous config saved to /var/cache/conftool/dbconfig/20240222-160534-arnaudb.json
  • 16:05 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1214.eqiad.wmnet with reason: Maintenance
  • 16:05 volans@cumin1002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts sretest1001.eqiad.wmnet
  • 16:05 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1214.eqiad.wmnet with reason: Maintenance
  • 16:05 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1211 (T357189)', diff saved to https://phabricator.wikimedia.org/P57737 and previous config saved to /var/cache/conftool/dbconfig/20240222-160512-arnaudb.json
  • 16:04 volans@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1001.eqiad.wmnet
  • 16:00 topranks: Commencing network maintenance migrating servers to new switch codfw rack B2 T355868
  • 15:58 cmooney@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host testvm2002.codfw.wmnet with OS bullseye
  • 15:57 hnowlan: depooling mw[1458,1467-1468,1483-1485,1494].eqiad.wmnet in advance of reimaging
  • 15:56 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 25 hosts with reason: Migrating servers in codfw rack B2 to lsw1-b2-codfw
  • 15:55 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on 25 hosts with reason: Migrating servers in codfw rack B2 to lsw1-b2-codfw
  • 15:54 mvernon@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=swift,name=codfw
  • 15:54 Emperor: depool codfs-mw T355868
  • 15:53 Emperor: depool thanos-fe2002 T355868
  • 15:51 marostegui@cumin1002: dbctl commit (dc=all): 'db2149 (re)pooling @ 50%: After recloning', diff saved to https://phabricator.wikimedia.org/P57736 and previous config saved to /var/cache/conftool/dbconfig/20240222-155141-root.json
  • 15:50 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P57735 and previous config saved to /var/cache/conftool/dbconfig/20240222-155005-arnaudb.json
  • 15:48 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on asw-b-codfw,cr[1-2]-codfw,lsw1-b2-codfw.mgmt with reason: prepping for server uplink migration codfw rack b2
  • 15:48 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on asw-b-codfw,cr[1-2]-codfw,lsw1-b2-codfw.mgmt with reason: prepping for server uplink migration codfw rack b2
  • 15:46 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on cp[2031-2032].codfw.wmnet with reason: T355868
  • 15:46 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 3:00:00 on cp[2031-2032].codfw.wmnet with reason: T355868
  • 15:39 aqu@deploy2002: Finished deploy [airflow-dags/analytics_test@b115452]: Deploy Refine job POC on test cluster (duration: 00m 16s)
  • 15:39 aqu@deploy2002: Started deploy [airflow-dags/analytics_test@b115452]: Deploy Refine job POC on test cluster
  • 15:36 marostegui@cumin1002: dbctl commit (dc=all): 'db2149 (re)pooling @ 25%: After recloning', diff saved to https://phabricator.wikimedia.org/P57734 and previous config saved to /var/cache/conftool/dbconfig/20240222-153636-root.json
  • 15:35 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P57733 and previous config saved to /var/cache/conftool/dbconfig/20240222-153459-arnaudb.json
  • 15:32 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on testvm2002.codfw.wmnet with reason: host reimage
  • 15:27 moritzm: installing glib2.0 security updates on bullseye
  • 15:27 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on testvm2002.codfw.wmnet with reason: host reimage
  • 15:21 marostegui@cumin1002: dbctl commit (dc=all): 'db2149 (re)pooling @ 10%: After recloning', diff saved to https://phabricator.wikimedia.org/P57732 and previous config saved to /var/cache/conftool/dbconfig/20240222-152131-root.json
  • 15:19 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1211 (T357189)', diff saved to https://phabricator.wikimedia.org/P57731 and previous config saved to /var/cache/conftool/dbconfig/20240222-151952-arnaudb.json
  • 15:17 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1211 (T357189)', diff saved to https://phabricator.wikimedia.org/P57730 and previous config saved to /var/cache/conftool/dbconfig/20240222-151733-arnaudb.json
  • 15:17 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1211.eqiad.wmnet with reason: Maintenance
  • 15:17 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1211.eqiad.wmnet with reason: Maintenance
  • 15:17 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1203 (T357189)', diff saved to https://phabricator.wikimedia.org/P57729 and previous config saved to /var/cache/conftool/dbconfig/20240222-151701-arnaudb.json
  • 15:15 cmooney@cumin1002: START - Cookbook sre.hosts.reimage for host testvm2002.codfw.wmnet with OS bullseye
  • 15:15 akosiaris@cumin1002: conftool action : set/pooled=yes; selector: service=parsoid-php,name=kubernetes.*
  • 15:15 akosiaris: T357392 pool 46 kubernetes hosts of parsoid-php with a weight of 1. Since the 42 parse hosts are at weight 110, that means about 1% (46 of 4666 total weight) goes to the mw-parsoid deployment, aka mw-on-k8s
  • 15:13 akosiaris@cumin1002: conftool action : set/weight=1; selector: service=parsoid-php,name=kubernetes.*
  • 15:12 akosiaris@cumin1002: conftool action : set/weight=110; selector: service=parsoid-php,name=(pars.*|mw.*)
  • 15:12 akosiaris: Bump weight of old parsoid hosts from 10 to 110. This is a noop right now but will make the calculations spelled out later in T357392 possible.
  • 14:55 akosiaris@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 14:55 akosiaris@deploy2002: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 14:55 akosiaris@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 14:55 akosiaris@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 14:51 marostegui@cumin1002: dbctl commit (dc=all): 'db2149 (re)pooling @ 1%: After recloning', diff saved to https://phabricator.wikimedia.org/P57726 and previous config saved to /var/cache/conftool/dbconfig/20240222-145120-root.json
  • 14:46 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P57725 and previous config saved to /var/cache/conftool/dbconfig/20240222-144648-arnaudb.json
  • 14:45 cgoubert@deploy2002: Finished scap: Backport for Enable $wgLocalHTTPProxy on group1 wikis (T298265) (duration: 17m 46s)
  • 14:44 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-redacteddb1001.eqiad.wmnet with OS bullseye
  • 14:44 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host an-redacteddb1001.eqiad.wmnet with OS bullseye
  • 14:37 cgoubert@deploy2002: cgoubert: Continuing with sync
  • 14:31 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1203 (T357189)', diff saved to https://phabricator.wikimedia.org/P57724 and previous config saved to /var/cache/conftool/dbconfig/20240222-143141-arnaudb.json
  • 14:29 cgoubert@deploy2002: cgoubert: Backport for Enable $wgLocalHTTPProxy on group1 wikis (T298265) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:29 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1203 (T357189)', diff saved to https://phabricator.wikimedia.org/P57723 and previous config saved to /var/cache/conftool/dbconfig/20240222-142921-arnaudb.json
  • 14:29 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1203.eqiad.wmnet with reason: Maintenance
  • 14:29 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1203.eqiad.wmnet with reason: Maintenance
  • 14:29 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1193 (T357189)', diff saved to https://phabricator.wikimedia.org/P57722 and previous config saved to /var/cache/conftool/dbconfig/20240222-142859-arnaudb.json
  • 14:28 cgoubert@deploy2002: Started scap: Backport for Enable $wgLocalHTTPProxy on group1 wikis (T298265)
  • 14:15 marostegui@cumin1002: dbctl commit (dc=all): 'es1028 (re)pooling @ 100%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57721 and previous config saved to /var/cache/conftool/dbconfig/20240222-141508-root.json
  • 14:13 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P57720 and previous config saved to /var/cache/conftool/dbconfig/20240222-141353-arnaudb.json
  • 14:03 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset-next: apply
  • 14:00 marostegui@cumin1002: dbctl commit (dc=all): 'es1028 (re)pooling @ 75%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57719 and previous config saved to /var/cache/conftool/dbconfig/20240222-140003-root.json
  • 13:58 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P57718 and previous config saved to /var/cache/conftool/dbconfig/20240222-135846-arnaudb.json
  • 13:53 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 13:52 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset-next: apply
  • 13:52 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 13:52 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset-next: apply
  • 13:52 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 13:52 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset-next: apply
  • 13:52 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 13:51 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset-next: apply
  • 13:51 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 13:51 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset-next: apply
  • 13:51 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 13:46 aborrero@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1034.eqiad.wmnet with OS bookworm
  • 13:46 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset-next: apply
  • 13:45 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 13:45 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset-next: apply
  • 13:45 marostegui@cumin1002: dbctl commit (dc=all): 'es1028 (re)pooling @ 50%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57717 and previous config saved to /var/cache/conftool/dbconfig/20240222-134458-root.json
  • 13:44 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 13:43 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1193 (T357189)', diff saved to https://phabricator.wikimedia.org/P57716 and previous config saved to /var/cache/conftool/dbconfig/20240222-134340-arnaudb.json
  • 13:41 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1193 (T357189)', diff saved to https://phabricator.wikimedia.org/P57715 and previous config saved to /var/cache/conftool/dbconfig/20240222-134120-arnaudb.json
  • 13:41 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 13:41 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1193.eqiad.wmnet with reason: Maintenance
  • 13:41 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1193.eqiad.wmnet with reason: Maintenance
  • 13:41 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192 (T357189)', diff saved to https://phabricator.wikimedia.org/P57714 and previous config saved to /var/cache/conftool/dbconfig/20240222-134059-arnaudb.json
  • 13:40 aborrero@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1034
  • 13:40 aborrero@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1034
  • 13:34 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 13:29 marostegui@cumin1002: dbctl commit (dc=all): 'es1028 (re)pooling @ 25%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57713 and previous config saved to /var/cache/conftool/dbconfig/20240222-132953-root.json
  • 13:25 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P57712 and previous config saved to /var/cache/conftool/dbconfig/20240222-132551-arnaudb.json
  • 13:20 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 13:20 aborrero@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1034.eqiad.wmnet with reason: host reimage
  • 13:18 aborrero@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1034.eqiad.wmnet with reason: host reimage
  • 13:14 marostegui@cumin1002: dbctl commit (dc=all): 'es1028 (re)pooling @ 10%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57711 and previous config saved to /var/cache/conftool/dbconfig/20240222-131448-root.json
  • 13:13 godog: bounce grafana to apply new datasources
  • 13:10 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P57710 and previous config saved to /var/cache/conftool/dbconfig/20240222-131045-arnaudb.json
  • 13:05 Emperor: ms-codfw set ACL {"read-only":["mw:backup"]} T269108
  • 13:03 Emperor: ms-eqiad set ACL {"read-only":["mw:backup"]} T269108
  • 13:02 eoghan@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Upgrading gitlab
  • 13:01 aborrero@cumin1002: START - Cookbook sre.hosts.reimage for host cloudvirt1034.eqiad.wmnet with OS bookworm
  • 12:59 marostegui@cumin1002: dbctl commit (dc=all): 'es1028 (re)pooling @ 5%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57709 and previous config saved to /var/cache/conftool/dbconfig/20240222-125943-root.json
  • 12:55 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192 (T357189)', diff saved to https://phabricator.wikimedia.org/P57708 and previous config saved to /var/cache/conftool/dbconfig/20240222-125538-arnaudb.json
  • 12:53 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1192 (T357189)', diff saved to https://phabricator.wikimedia.org/P57707 and previous config saved to /var/cache/conftool/dbconfig/20240222-125319-arnaudb.json
  • 12:53 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1192.eqiad.wmnet with reason: Maintenance
  • 12:53 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1192.eqiad.wmnet with reason: Maintenance
  • 12:52 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T357189)', diff saved to https://phabricator.wikimedia.org/P57706 and previous config saved to /var/cache/conftool/dbconfig/20240222-125257-arnaudb.json
  • 12:52 eoghan@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Upgrading gitlab
  • 12:45 eoghan@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Upgrading gitlab
  • 12:44 marostegui@cumin1002: dbctl commit (dc=all): 'es1028 (re)pooling @ 1%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57705 and previous config saved to /var/cache/conftool/dbconfig/20240222-124438-root.json
  • 12:37 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P57704 and previous config saved to /var/cache/conftool/dbconfig/20240222-123750-arnaudb.json
  • 12:22 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P57703 and previous config saved to /var/cache/conftool/dbconfig/20240222-122244-arnaudb.json
  • 12:07 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T357189)', diff saved to https://phabricator.wikimedia.org/P57702 and previous config saved to /var/cache/conftool/dbconfig/20240222-120737-arnaudb.json
  • 12:05 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1178 (T357189)', diff saved to https://phabricator.wikimedia.org/P57701 and previous config saved to /var/cache/conftool/dbconfig/20240222-120518-arnaudb.json
  • 12:05 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1178.eqiad.wmnet with reason: Maintenance
  • 12:04 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1178.eqiad.wmnet with reason: Maintenance
  • 12:04 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T357189)', diff saved to https://phabricator.wikimedia.org/P57700 and previous config saved to /var/cache/conftool/dbconfig/20240222-120445-arnaudb.json
  • 12:02 eoghan@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Upgrading gitlab
  • 11:55 eoghan@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrading gitlab
  • 11:53 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply
  • 11:52 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
  • 11:51 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply
  • 11:51 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
  • 11:50 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1028.eqiad.wmnet with OS bookworm
  • 11:49 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P57699 and previous config saved to /var/cache/conftool/dbconfig/20240222-114938-arnaudb.json
  • 11:34 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P57698 and previous config saved to /var/cache/conftool/dbconfig/20240222-113432-arnaudb.json
  • 11:29 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1028.eqiad.wmnet with reason: host reimage
  • 11:26 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es1028.eqiad.wmnet with reason: host reimage
  • 11:19 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T357189)', diff saved to https://phabricator.wikimedia.org/P57697 and previous config saved to /var/cache/conftool/dbconfig/20240222-111925-arnaudb.json
  • 11:17 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1177 (T357189)', diff saved to https://phabricator.wikimedia.org/P57696 and previous config saved to /var/cache/conftool/dbconfig/20240222-111706-arnaudb.json
  • 11:17 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1177.eqiad.wmnet with reason: Maintenance
  • 11:16 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1177.eqiad.wmnet with reason: Maintenance
  • 11:16 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T357189)', diff saved to https://phabricator.wikimedia.org/P57695 and previous config saved to /var/cache/conftool/dbconfig/20240222-111644-arnaudb.json
  • 11:12 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es1028.eqiad.wmnet with OS bookworm
  • 11:09 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es1028 T358180', diff saved to https://phabricator.wikimedia.org/P57694 and previous config saved to /var/cache/conftool/dbconfig/20240222-110914-root.json
  • 11:01 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P57693 and previous config saved to /var/cache/conftool/dbconfig/20240222-110138-arnaudb.json
  • 10:46 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P57692 and previous config saved to /var/cache/conftool/dbconfig/20240222-104632-arnaudb.json
  • 10:35 marostegui@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1016.eqiad.wmnet,service=s5
  • 10:35 marostegui@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1016.eqiad.wmnet,service=s8
  • 10:31 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T357189)', diff saved to https://phabricator.wikimedia.org/P57690 and previous config saved to /var/cache/conftool/dbconfig/20240222-103125-arnaudb.json
  • 10:31 marostegui@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1016.eqiad.wmnet,service=s8
  • 10:31 marostegui@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1016.eqiad.wmnet,service=s5
  • 10:29 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1172 (T357189)', diff saved to https://phabricator.wikimedia.org/P57689 and previous config saved to /var/cache/conftool/dbconfig/20240222-102906-arnaudb.json
  • 10:29 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1172.eqiad.wmnet with reason: Maintenance
  • 10:28 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1172.eqiad.wmnet with reason: Maintenance
  • 10:28 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 10:28 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 10:28 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T357189)', diff saved to https://phabricator.wikimedia.org/P57688 and previous config saved to /var/cache/conftool/dbconfig/20240222-102817-arnaudb.json
  • 10:13 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P57687 and previous config saved to /var/cache/conftool/dbconfig/20240222-101310-arnaudb.json
  • 10:11 arnaudb@cumin1002: dbctl commit (dc=all): 'db2195 (re)pooling @ 100%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57686 and previous config saved to /var/cache/conftool/dbconfig/20240222-101123-arnaudb.json
  • 10:10 arnaudb@cumin1002: dbctl commit (dc=all): 'db1180 (re)pooling @ 100%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57685 and previous config saved to /var/cache/conftool/dbconfig/20240222-101018-arnaudb.json
  • 10:01 marostegui@cumin1002: dbctl commit (dc=all): 'es2033 (re)pooling @ 100%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57684 and previous config saved to /var/cache/conftool/dbconfig/20240222-100140-root.json
  • 09:58 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P57683 and previous config saved to /var/cache/conftool/dbconfig/20240222-095804-arnaudb.json
  • 09:56 arnaudb@cumin1002: dbctl commit (dc=all): 'db2195 (re)pooling @ 75%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57682 and previous config saved to /var/cache/conftool/dbconfig/20240222-095619-arnaudb.json
  • 09:55 arnaudb@cumin1002: dbctl commit (dc=all): 'db1180 (re)pooling @ 75%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57681 and previous config saved to /var/cache/conftool/dbconfig/20240222-095513-arnaudb.json
  • 09:46 marostegui@cumin1002: dbctl commit (dc=all): 'es2033 (re)pooling @ 75%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57680 and previous config saved to /var/cache/conftool/dbconfig/20240222-094635-root.json
  • 09:42 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T357189)', diff saved to https://phabricator.wikimedia.org/P57679 and previous config saved to /var/cache/conftool/dbconfig/20240222-094257-arnaudb.json
  • 09:41 arnaudb@cumin1002: dbctl commit (dc=all): 'db2195 (re)pooling @ 50%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57678 and previous config saved to /var/cache/conftool/dbconfig/20240222-094114-arnaudb.json
  • 09:40 arnaudb@cumin1002: dbctl commit (dc=all): 'db1180 (re)pooling @ 50%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57677 and previous config saved to /var/cache/conftool/dbconfig/20240222-094008-arnaudb.json
  • 09:31 marostegui@cumin1002: dbctl commit (dc=all): 'es2033 (re)pooling @ 50%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57675 and previous config saved to /var/cache/conftool/dbconfig/20240222-093130-root.json
  • 09:26 arnaudb@cumin1002: dbctl commit (dc=all): 'db2195 (re)pooling @ 20%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57674 and previous config saved to /var/cache/conftool/dbconfig/20240222-092609-arnaudb.json
  • 09:25 arnaudb@cumin1002: dbctl commit (dc=all): 'db1180 (re)pooling @ 20%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57673 and previous config saved to /var/cache/conftool/dbconfig/20240222-092503-arnaudb.json
  • 09:16 marostegui@cumin1002: dbctl commit (dc=all): 'es2033 (re)pooling @ 25%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57672 and previous config saved to /var/cache/conftool/dbconfig/20240222-091626-root.json
  • 09:03 jayme: restart prometheus@k8s in eqiad - T343529
  • 09:01 marostegui@cumin1002: dbctl commit (dc=all): 'es2033 (re)pooling @ 10%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57671 and previous config saved to /var/cache/conftool/dbconfig/20240222-090121-root.json
  • 09:01 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2143.codfw.wmnet
  • 09:00 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2195.codfw.wmnet
  • 08:59 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db1180.eqiad.wmnet
  • 08:58 marostegui@cumin1002: dbctl commit (dc=all): 'es1033 (re)pooling @ 100%: After migration', diff saved to https://phabricator.wikimedia.org/P57670 and previous config saved to /var/cache/conftool/dbconfig/20240222-085800-root.json
  • 08:56 arnaudb@cumin1002: START - Cookbook sre.mysql.upgrade for db2195.codfw.wmnet
  • 08:55 arnaudb@cumin1002: START - Cookbook sre.mysql.upgrade for db2143.codfw.wmnet
  • 08:55 arnaudb@cumin1002: START - Cookbook sre.mysql.upgrade for db1180.eqiad.wmnet
  • 08:55 arnaudb@cumin1002: dbctl commit (dc=all): 'T356240 - depooling db1187 db2143 db2195', diff saved to https://phabricator.wikimedia.org/P57669 and previous config saved to /var/cache/conftool/dbconfig/20240222-085521-arnaudb.json
  • 08:52 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on db[2143,2195].codfw.wmnet,db1187.eqiad.wmnet with reason: Silence for reboot T356240
  • 08:52 jayme: rolling out prometheus-rsyslog-exporter 1.0.0+git20221110-1 to wikikube nodes - T357616
  • 08:52 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on db[2143,2195].codfw.wmnet,db1187.eqiad.wmnet with reason: Silence for reboot T356240
  • 08:46 marostegui@cumin1002: dbctl commit (dc=all): 'es2033 (re)pooling @ 5%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57668 and previous config saved to /var/cache/conftool/dbconfig/20240222-084616-root.json
  • 08:44 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetmaster1002.eqiad.wmnet
  • 08:42 marostegui@cumin1002: dbctl commit (dc=all): 'es1033 (re)pooling @ 75%: After migration', diff saved to https://phabricator.wikimedia.org/P57667 and previous config saved to /var/cache/conftool/dbconfig/20240222-084255-root.json
  • 08:42 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1167 (T357189)', diff saved to https://phabricator.wikimedia.org/P57666 and previous config saved to /var/cache/conftool/dbconfig/20240222-084235-arnaudb.json
  • 08:42 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 08:42 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 08:42 jelto@cumin1002: START - Cookbook sre.hosts.reboot-single for host puppetmaster1002.eqiad.wmnet
  • 08:42 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1167.eqiad.wmnet with reason: Maintenance
  • 08:42 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1167.eqiad.wmnet with reason: Maintenance
  • 08:31 marostegui@cumin1002: dbctl commit (dc=all): 'es2033 (re)pooling @ 1%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57665 and previous config saved to /var/cache/conftool/dbconfig/20240222-083111-root.json
  • 08:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2033.codfw.wmnet with OS bookworm
  • 08:29 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 18779
  • 08:28 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 18779
  • 08:27 marostegui@cumin1002: dbctl commit (dc=all): 'es1033 (re)pooling @ 50%: After migration', diff saved to https://phabricator.wikimedia.org/P57664 and previous config saved to /var/cache/conftool/dbconfig/20240222-082750-root.json
  • 08:25 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 138997
  • 08:24 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 138997
  • 08:24 ayounsi@cumin1002: END (FAIL) - Cookbook sre.network.peering (exit_code=99) with action 'email' for AS: 138997
  • 08:23 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 138997
  • 08:21 hoo@deploy2002: Finished scap: Backport for Migrate to virtual domain mapping (T348526), Migrate to virtual domain mapping (T348526) (duration: 14m 44s)
  • 08:20 marostegui@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1017.eqiad.wmnet,service=s1
  • 08:20 marostegui@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1017.eqiad.wmnet,service=s3
  • 08:14 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2033.codfw.wmnet with reason: host reimage
  • 08:13 hoo@deploy2002: hoo: Continuing with sync
  • 08:12 marostegui@cumin1002: dbctl commit (dc=all): 'es1033 (re)pooling @ 25%: After migration', diff saved to https://phabricator.wikimedia.org/P57663 and previous config saved to /var/cache/conftool/dbconfig/20240222-081243-root.json
  • 08:12 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es2033.codfw.wmnet with reason: host reimage
  • 08:08 hoo@deploy2002: hoo: Backport for Migrate to virtual domain mapping (T348526), Migrate to virtual domain mapping (T348526) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:06 hoo@deploy2002: Started scap: Backport for Migrate to virtual domain mapping (T348526), Migrate to virtual domain mapping (T348526)
  • 07:58 taavi: taavi@puppetmaster1002 ~ $ sudo systemctl restart apache2 # lots of 'Error 500 on SERVER: Server Error: undefined method `content' for nil:NilClass' in the logs, seems to have helped
  • 07:57 marostegui@cumin1002: dbctl commit (dc=all): 'es1033 (re)pooling @ 10%: After migration', diff saved to https://phabricator.wikimedia.org/P57662 and previous config saved to /var/cache/conftool/dbconfig/20240222-075738-root.json
  • 07:54 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es2033.codfw.wmnet with OS bookworm
  • 07:54 marostegui@cumin1002: dbctl commit (dc=all): 'db2137 (re)pooling @ 100%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57661 and previous config saved to /var/cache/conftool/dbconfig/20240222-075448-root.json
  • 07:42 marostegui@cumin1002: dbctl commit (dc=all): 'es1033 (re)pooling @ 5%: After migration', diff saved to https://phabricator.wikimedia.org/P57660 and previous config saved to /var/cache/conftool/dbconfig/20240222-074233-root.json
  • 07:40 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es2033 T358080', diff saved to https://phabricator.wikimedia.org/P57659 and previous config saved to /var/cache/conftool/dbconfig/20240222-074042-root.json
  • 07:39 marostegui@cumin1002: dbctl commit (dc=all): 'db2137 (re)pooling @ 75%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57658 and previous config saved to /var/cache/conftool/dbconfig/20240222-073943-root.json
  • 07:30 marostegui@cumin1002: dbctl commit (dc=all): 'Promote es2026 as es2 codfw master T358080', diff saved to https://phabricator.wikimedia.org/P57657 and previous config saved to /var/cache/conftool/dbconfig/20240222-073017-marostegui.json
  • 07:27 marostegui@cumin1002: dbctl commit (dc=all): 'es1033 (re)pooling @ 1%: After migration', diff saved to https://phabricator.wikimedia.org/P57656 and previous config saved to /var/cache/conftool/dbconfig/20240222-072729-root.json
  • 07:24 marostegui@cumin1002: dbctl commit (dc=all): 'db2137 (re)pooling @ 50%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57655 and previous config saved to /var/cache/conftool/dbconfig/20240222-072438-root.json
  • 07:19 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1033.eqiad.wmnet with OS bookworm
  • 07:09 marostegui@cumin1002: dbctl commit (dc=all): 'db2137 (re)pooling @ 25%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57654 and previous config saved to /var/cache/conftool/dbconfig/20240222-070933-root.json
  • 06:58 marostegui@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on es1033.eqiad.wmnet with reason: host reimage
  • 06:57 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es1033.eqiad.wmnet with reason: host reimage
  • 06:54 marostegui@cumin1002: dbctl commit (dc=all): 'db2137 (re)pooling @ 10%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57653 and previous config saved to /var/cache/conftool/dbconfig/20240222-065428-root.json
  • 06:48 marostegui@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1017.eqiad.wmnet,service=s3
  • 06:48 marostegui@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1017.eqiad.wmnet,service=s1
  • 06:48 marostegui@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1017.eqiad.wmnet,service=s1
  • 06:47 marostegui@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1018.eqiad.wmnet,service=s3
  • 06:47 marostegui@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1018.eqiad.wmnet,service=s1
  • 06:46 marostegui@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1018.eqiad.wmnet,service=s1
  • 06:46 marostegui@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1018.eqiad.wmnet,service=s3
  • 06:44 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es1033.eqiad.wmnet with OS bookworm
  • 06:42 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es1033 T358080', diff saved to https://phabricator.wikimedia.org/P57652 and previous config saved to /var/cache/conftool/dbconfig/20240222-064253-root.json
  • 06:42 marostegui@cumin1002: dbctl commit (dc=all): 'Promote es1030 as es2 master T358080', diff saved to https://phabricator.wikimedia.org/P57651 and previous config saved to /var/cache/conftool/dbconfig/20240222-064205-marostegui.json
  • 06:39 marostegui@cumin1002: dbctl commit (dc=all): 'db2137 (re)pooling @ 5%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57650 and previous config saved to /var/cache/conftool/dbconfig/20240222-063923-root.json
  • 01:29 eileen: config revision changed from 5bdfab7a to b221a95a
  • 01:28 eileen: config revision changed from 5bdfab7a to b221a95a
  • 01:27 eileen: civicrm upgraded from cd839468 to c50fcae3
  • 00:43 rzl: rzl@lists1001:~$ sudo systemctl restart mailman3 # T358020
  • 00:12 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T357189)', diff saved to https://phabricator.wikimedia.org/P57649 and previous config saved to /var/cache/conftool/dbconfig/20240222-001210-arnaudb.json

2024-02-21

  • 23:57 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P57648 and previous config saved to /var/cache/conftool/dbconfig/20240221-235703-arnaudb.json
  • 23:41 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P57647 and previous config saved to /var/cache/conftool/dbconfig/20240221-234156-arnaudb.json
  • 23:37 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:37 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:28 eileen: config revision changed from c6fc16bb to 5bdfab7a
  • 23:26 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T357189)', diff saved to https://phabricator.wikimedia.org/P57646 and previous config saved to /var/cache/conftool/dbconfig/20240221-232649-arnaudb.json
  • 23:24 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:24 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:53 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2182 (T357189)', diff saved to https://phabricator.wikimedia.org/P57645 and previous config saved to /var/cache/conftool/dbconfig/20240221-225350-arnaudb.json
  • 22:53 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2182.codfw.wmnet with reason: Maintenance
  • 22:53 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2182.codfw.wmnet with reason: Maintenance
  • 22:53 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T357189)', diff saved to https://phabricator.wikimedia.org/P57644 and previous config saved to /var/cache/conftool/dbconfig/20240221-225326-arnaudb.json
  • 22:51 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:50 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:38 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P57643 and previous config saved to /var/cache/conftool/dbconfig/20240221-223819-arnaudb.json
  • 22:29 ryankemper@deploy2002: Finished deploy [wdqs/wdqs@8a290df]: new allowlisted endpoints for wdqs (duration: 11m 59s)
  • 22:25 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:25 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:23 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P57642 and previous config saved to /var/cache/conftool/dbconfig/20240221-222313-arnaudb.json
  • 22:20 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:20 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:17 ryankemper@deploy2002: Started deploy [wdqs/wdqs@8a290df]: new allowlisted endpoints for wdqs
  • 22:12 Dreamy_Jazz: Evening UTC backport window done
  • 22:10 ryankemper: [WDQS] T355868 Depooling `wdqs2024`, `wdqs2014`, `wdqs2010` in anticipation of row maintenance
  • 22:08 dreamyjazz@deploy2002: Finished scap: Backport for Pin wgGlobalBlockingAllowGlobalAccountBlocks as false on WMF wikis (T356923 T356924) (duration: 10m 16s)
  • 22:08 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T357189)', diff saved to https://phabricator.wikimedia.org/P57641 and previous config saved to /var/cache/conftool/dbconfig/20240221-220807-arnaudb.json
  • 22:02 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Banning hosts: elastic2041*,elastic2042*,elastic2057*,elastic2063*,elastic2064*,elastic2077*,elastic2078*,elastic2092*,elastic2093*,elastic2094* for switch maintenance - bking@cumin2002 - T355860
  • 22:02 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Banning hosts: elastic2041*,elastic2042*,elastic2057*,elastic2063*,elastic2064*,elastic2077*,elastic2078*,elastic2092*,elastic2093*,elastic2094* for switch maintenance - bking@cumin2002 - T355860
  • 22:00 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
  • 22:00 dreamyjazz@deploy2002: dreamyjazz: Backport for Pin wgGlobalBlockingAllowGlobalAccountBlocks as false on WMF wikis (T356923 T356924) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:58 dreamyjazz@deploy2002: Started scap: Backport for Pin wgGlobalBlockingAllowGlobalAccountBlocks as false on WMF wikis (T356923 T356924)
  • 21:56 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2168 (T357189)', diff saved to https://phabricator.wikimedia.org/P57640 and previous config saved to /var/cache/conftool/dbconfig/20240221-215620-arnaudb.json
  • 21:56 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2168.codfw.wmnet with reason: Maintenance
  • 21:56 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2168.codfw.wmnet with reason: Maintenance
  • 21:56 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T357189)', diff saved to https://phabricator.wikimedia.org/P57639 and previous config saved to /var/cache/conftool/dbconfig/20240221-215558-arnaudb.json
  • 21:54 jhuneidi@deploy2002: Finished scap: Backport for cswiki, commonswiki, enwiki: fix IP cap date and IP for WikiGap Editathon (T357978) (duration: 10m 47s)
  • 21:52 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase1034.eqiad.wmnet with reason: Bootstrapping — T354560
  • 21:52 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase1034.eqiad.wmnet with reason: Bootstrapping — T354560
  • 21:51 urandom: bootstrapping Cassandra, restbase1034-{a,b,c} — T354560
  • 21:46 jhuneidi@deploy2002: anzx and jhuneidi: Continuing with sync
  • 21:45 jhuneidi@deploy2002: anzx and jhuneidi: Backport for cswiki, commonswiki, enwiki: fix IP cap date and IP for WikiGap Editathon (T357978) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:44 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ncmonitor1001.eqiad.wmnet with OS bookworm
  • 21:43 jhuneidi@deploy2002: Started scap: Backport for cswiki, commonswiki, enwiki: fix IP cap date and IP for WikiGap Editathon (T357978)
  • 21:42 jhuneidi@deploy2002: Finished scap: Backport for Remove Japanese Wikipedia from projects sharing user scripts (T301212), Enable night mode on beta cluster (T357759) (duration: 15m 25s)
  • 21:40 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P57638 and previous config saved to /var/cache/conftool/dbconfig/20240221-214052-arnaudb.json
  • 21:34 jhuneidi@deploy2002: jdlrobson and jhuneidi: Continuing with sync
  • 21:32 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 21:31 rzl@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 21:31 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ncmonitor1001.eqiad.wmnet with reason: host reimage
  • 21:29 jhuneidi@deploy2002: jdlrobson and jhuneidi: Backport for Remove Japanese Wikipedia from projects sharing user scripts (T301212), Enable night mode on beta cluster (T357759) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:27 jhuneidi@deploy2002: Started scap: Backport for Remove Japanese Wikipedia from projects sharing user scripts (T301212), Enable night mode on beta cluster (T357759)
  • 21:27 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ncmonitor1001.eqiad.wmnet with reason: host reimage
  • 21:26 jhuneidi@deploy2002: Finished scap: Backport for Turn on Parsoid read views by default on officewiki (T355566) (duration: 15m 19s)
  • 21:25 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P57637 and previous config saved to /var/cache/conftool/dbconfig/20240221-212546-arnaudb.json
  • 21:24 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 21:24 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 21:19 rzl@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 21:18 rzl@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 21:18 jhuneidi@deploy2002: cscott and jhuneidi: Continuing with sync
  • 21:17 rzl@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 21:17 brett@cumin2002: START - Cookbook sre.hosts.reimage for host ncmonitor1001.eqiad.wmnet with OS bookworm
  • 21:17 rzl@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 21:12 jhuneidi@deploy2002: cscott and jhuneidi: Backport for Turn on Parsoid read views by default on officewiki (T355566) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:11 jhuneidi@deploy2002: Started scap: Backport for Turn on Parsoid read views by default on officewiki (T355566)
  • 21:10 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T357189)', diff saved to https://phabricator.wikimedia.org/P57636 and previous config saved to /var/cache/conftool/dbconfig/20240221-211039-arnaudb.json
  • 21:00 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2159 (T357189)', diff saved to https://phabricator.wikimedia.org/P57635 and previous config saved to /var/cache/conftool/dbconfig/20240221-210001-arnaudb.json
  • 20:59 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 20:59 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 20:59 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2159.codfw.wmnet with reason: Maintenance
  • 20:59 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2159.codfw.wmnet with reason: Maintenance
  • 20:59 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T357189)', diff saved to https://phabricator.wikimedia.org/P57634 and previous config saved to /var/cache/conftool/dbconfig/20240221-205922-arnaudb.json
  • 20:54 jhuneidi@deploy2002: Synchronized php: group1 wikis to 1.42.0-wmf.19 refs T354437 (duration: 08m 35s)
  • 20:46 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: group1 wikis to 1.42.0-wmf.19 refs T354437
  • 20:44 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P57633 and previous config saved to /var/cache/conftool/dbconfig/20240221-204415-arnaudb.json
  • 20:39 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:39 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:33 ejegg: turned off nightly recurring charge job for Autorescue deployment
  • 20:29 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P57632 and previous config saved to /var/cache/conftool/dbconfig/20240221-202906-arnaudb.json
  • 20:16 jhuneidi@deploy2002: scap failed: average error rate on 4/4 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org for details)
  • 20:14 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T357189)', diff saved to https://phabricator.wikimedia.org/P57631 and previous config saved to /var/cache/conftool/dbconfig/20240221-201400-arnaudb.json
  • 20:11 jhuneidi@deploy2002: Finished scap: Backport for CentralAuthHooks::onGetUserBlock: Only run for reg. users (T358112) (duration: 14m 09s)
  • 20:03 jhuneidi@deploy2002: jhuneidi and matmarex: Continuing with sync
  • 20:03 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2150 (T357189)', diff saved to https://phabricator.wikimedia.org/P57630 and previous config saved to /var/cache/conftool/dbconfig/20240221-200209-arnaudb.json
  • 20:02 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2150.codfw.wmnet with reason: Maintenance
  • 20:02 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2150.codfw.wmnet with reason: Maintenance
  • 20:01 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T357189)', diff saved to https://phabricator.wikimedia.org/P57629 and previous config saved to /var/cache/conftool/dbconfig/20240221-200148-arnaudb.json
  • 19:58 jhuneidi@deploy2002: jhuneidi and matmarex: Backport for CentralAuthHooks::onGetUserBlock: Only run for reg. users (T358112) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 19:57 jhuneidi@deploy2002: Started scap: Backport for CentralAuthHooks::onGetUserBlock: Only run for reg. users (T358112)
  • 19:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2188 (T355609)', diff saved to https://phabricator.wikimedia.org/P57628 and previous config saved to /var/cache/conftool/dbconfig/20240221-195157-marostegui.json
  • 19:46 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P57627 and previous config saved to /var/cache/conftool/dbconfig/20240221-194641-arnaudb.json
  • 19:38 inflatador: bking@deploy2002 deleting old flink data from thanos-swift T348685
  • 19:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P57626 and previous config saved to /var/cache/conftool/dbconfig/20240221-193650-marostegui.json
  • 19:31 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P57625 and previous config saved to /var/cache/conftool/dbconfig/20240221-193135-arnaudb.json
  • 19:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P57624 and previous config saved to /var/cache/conftool/dbconfig/20240221-192144-marostegui.json
  • 19:16 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T357189)', diff saved to https://phabricator.wikimedia.org/P57623 and previous config saved to /var/cache/conftool/dbconfig/20240221-191628-arnaudb.json
  • 19:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2188 (T355609)', diff saved to https://phabricator.wikimedia.org/P57622 and previous config saved to /var/cache/conftool/dbconfig/20240221-190637-marostegui.json
  • 19:03 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2122 (T357189)', diff saved to https://phabricator.wikimedia.org/P57621 and previous config saved to /var/cache/conftool/dbconfig/20240221-190311-arnaudb.json
  • 19:03 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2122.codfw.wmnet with reason: Maintenance
  • 19:02 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2122.codfw.wmnet with reason: Maintenance
  • 19:02 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121 (T357189)', diff saved to https://phabricator.wikimedia.org/P57620 and previous config saved to /var/cache/conftool/dbconfig/20240221-190249-arnaudb.json
  • 18:47 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P57619 and previous config saved to /var/cache/conftool/dbconfig/20240221-184743-arnaudb.json
  • 18:41 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2188 (T355609)', diff saved to https://phabricator.wikimedia.org/P57618 and previous config saved to /var/cache/conftool/dbconfig/20240221-184144-marostegui.json
  • 18:41 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2188.codfw.wmnet with reason: Maintenance
  • 18:41 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2188.codfw.wmnet with reason: Maintenance
  • 18:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T355609)', diff saved to https://phabricator.wikimedia.org/P57617 and previous config saved to /var/cache/conftool/dbconfig/20240221-184120-marostegui.json
  • 18:32 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P57616 and previous config saved to /var/cache/conftool/dbconfig/20240221-183236-arnaudb.json
  • 18:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P57615 and previous config saved to /var/cache/conftool/dbconfig/20240221-182614-marostegui.json
  • 18:17 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121 (T357189)', diff saved to https://phabricator.wikimedia.org/P57614 and previous config saved to /var/cache/conftool/dbconfig/20240221-181729-arnaudb.json
  • 18:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P57613 and previous config saved to /var/cache/conftool/dbconfig/20240221-181107-marostegui.json
  • 18:01 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2121 (T357189)', diff saved to https://phabricator.wikimedia.org/P57612 and previous config saved to /var/cache/conftool/dbconfig/20240221-180103-arnaudb.json
  • 18:00 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 18:00 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 18:00 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2120 (T357189)', diff saved to https://phabricator.wikimedia.org/P57611 and previous config saved to /var/cache/conftool/dbconfig/20240221-180041-arnaudb.json
  • 17:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T355609)', diff saved to https://phabricator.wikimedia.org/P57610 and previous config saved to /var/cache/conftool/dbconfig/20240221-175601-marostegui.json
  • 17:45 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2120', diff saved to https://phabricator.wikimedia.org/P57609 and previous config saved to /var/cache/conftool/dbconfig/20240221-174534-arnaudb.json
  • 17:30 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2120', diff saved to https://phabricator.wikimedia.org/P57608 and previous config saved to /var/cache/conftool/dbconfig/20240221-173028-arnaudb.json
  • 17:27 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2176 (T355609)', diff saved to https://phabricator.wikimedia.org/P57607 and previous config saved to /var/cache/conftool/dbconfig/20240221-172731-marostegui.json
  • 17:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2176.codfw.wmnet with reason: Maintenance
  • 17:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2176.codfw.wmnet with reason: Maintenance
  • 17:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T355609)', diff saved to https://phabricator.wikimedia.org/P57606 and previous config saved to /var/cache/conftool/dbconfig/20240221-172709-marostegui.json
  • 17:15 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2120 (T357189)', diff saved to https://phabricator.wikimedia.org/P57605 and previous config saved to /var/cache/conftool/dbconfig/20240221-171521-arnaudb.json
  • 17:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P57604 and previous config saved to /var/cache/conftool/dbconfig/20240221-171203-marostegui.json
  • 17:09 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host an-redacteddb1001.eqiad.wmnet with OS bullseye
  • 17:02 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2120 (T357189)', diff saved to https://phabricator.wikimedia.org/P57603 and previous config saved to /var/cache/conftool/dbconfig/20240221-170157-arnaudb.json
  • 17:02 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2120.codfw.wmnet with reason: Maintenance
  • 17:01 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2120.codfw.wmnet with reason: Maintenance
  • 17:01 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2108 (T357189)', diff saved to https://phabricator.wikimedia.org/P57602 and previous config saved to /var/cache/conftool/dbconfig/20240221-170134-arnaudb.json
  • 16:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P57601 and previous config saved to /var/cache/conftool/dbconfig/20240221-165657-marostegui.json
  • 16:56 arnaudb@cumin1002: dbctl commit (dc=all): 'db2106 (re)pooling @ 100%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57600 and previous config saved to /var/cache/conftool/dbconfig/20240221-165651-arnaudb.json
  • 16:56 arnaudb@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 100%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57599 and previous config saved to /var/cache/conftool/dbconfig/20240221-165644-arnaudb.json
  • 16:46 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2108', diff saved to https://phabricator.wikimedia.org/P57598 and previous config saved to /var/cache/conftool/dbconfig/20240221-164628-arnaudb.json
  • 16:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T355609)', diff saved to https://phabricator.wikimedia.org/P57597 and previous config saved to /var/cache/conftool/dbconfig/20240221-164150-marostegui.json
  • 16:41 arnaudb@cumin1002: dbctl commit (dc=all): 'db2106 (re)pooling @ 75%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57596 and previous config saved to /var/cache/conftool/dbconfig/20240221-164146-arnaudb.json
  • 16:41 arnaudb@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 75%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57595 and previous config saved to /var/cache/conftool/dbconfig/20240221-164140-arnaudb.json
  • 16:34 marostegui@cumin1002: dbctl commit (dc=all): 'es2026 (re)pooling @ 100%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57594 and previous config saved to /var/cache/conftool/dbconfig/20240221-163433-root.json
  • 16:31 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2108', diff saved to https://phabricator.wikimedia.org/P57593 and previous config saved to /var/cache/conftool/dbconfig/20240221-163122-arnaudb.json
  • 16:26 arnaudb@cumin1002: dbctl commit (dc=all): 'db2106 (re)pooling @ 50%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57592 and previous config saved to /var/cache/conftool/dbconfig/20240221-162641-arnaudb.json
  • 16:26 arnaudb@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 50%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57591 and previous config saved to /var/cache/conftool/dbconfig/20240221-162635-arnaudb.json
  • 16:25 claime: Uncordoning kubernetes2025.codfw.wmnet kubernetes2026.codfw.wmnet following codfw A8 network migration - T355874
  • 16:24 cgoubert@cumin2002: conftool action : set/pooled=yes; selector: name=parse200(4|5).*
  • 16:24 claime: Repooling parse2004.codfw.wmnet parse2005.codfw.wmnet following codfw A8 network migration - T355874
  • 16:19 marostegui@cumin1002: dbctl commit (dc=all): 'es2026 (re)pooling @ 75%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57590 and previous config saved to /var/cache/conftool/dbconfig/20240221-161928-root.json
  • 16:16 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2108 (T357189)', diff saved to https://phabricator.wikimedia.org/P57589 and previous config saved to /var/cache/conftool/dbconfig/20240221-161615-arnaudb.json
  • 16:14 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2174 (T355609)', diff saved to https://phabricator.wikimedia.org/P57588 and previous config saved to /var/cache/conftool/dbconfig/20240221-161407-marostegui.json
  • 16:14 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2174.codfw.wmnet with reason: Maintenance
  • 16:14 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2174.codfw.wmnet with reason: Maintenance
  • 16:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2173 (T355609)', diff saved to https://phabricator.wikimedia.org/P57587 and previous config saved to /var/cache/conftool/dbconfig/20240221-161345-marostegui.json
  • 16:11 arnaudb@cumin1002: dbctl commit (dc=all): 'db2106 (re)pooling @ 20%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57586 and previous config saved to /var/cache/conftool/dbconfig/20240221-161136-arnaudb.json
  • 16:11 arnaudb@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 20%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57585 and previous config saved to /var/cache/conftool/dbconfig/20240221-161129-arnaudb.json
  • 16:09 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2137.codfw.wmnet with OS bookworm
  • 16:06 jayme: imported prometheus-rsyslog-exporter 1.0.0+git20221110-1 to buster,bullseye,bookworm - T357616
  • 16:05 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2108 (T357189)', diff saved to https://phabricator.wikimedia.org/P57584 and previous config saved to /var/cache/conftool/dbconfig/20240221-160511-arnaudb.json
  • 16:05 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2108.codfw.wmnet with reason: Maintenance
  • 16:05 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2108.codfw.wmnet with reason: Maintenance
  • 16:04 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
  • 16:04 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/api-gateway: apply
  • 16:04 marostegui@cumin1002: dbctl commit (dc=all): 'es2026 (re)pooling @ 50%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57583 and previous config saved to /var/cache/conftool/dbconfig/20240221-160423-root.json
  • 16:03 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
  • 16:03 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
  • 16:02 topranks: Commencing network maintenance migrating servers to new switch codfw rack A8 T355874
  • 15:59 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 6 hosts with reason: Migrating servers in codfw rack A7 to lsw1-a7-codfw
  • 15:58 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on 6 hosts with reason: Migrating servers in codfw rack A7 to lsw1-a7-codfw
  • 15:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P57582 and previous config saved to /var/cache/conftool/dbconfig/20240221-155839-marostegui.json
  • 15:58 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on asw-a-codfw,cr[1-2]-codfw,lsw1-a8-codfw.mgmt with reason: prepping for server uplink migration codfw rack a8
  • 15:57 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on asw-a-codfw,cr[1-2]-codfw,lsw1-a8-codfw.mgmt with reason: prepping for server uplink migration codfw rack a8
  • 15:55 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2100.codfw.wmnet with reason: Maintenance
  • 15:55 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2100.codfw.wmnet with reason: Maintenance
  • 15:55 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
  • 15:54 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
  • 15:52 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
  • 15:51 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/api-gateway: apply
  • 15:49 marostegui@cumin1002: dbctl commit (dc=all): 'es2026 (re)pooling @ 25%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57581 and previous config saved to /var/cache/conftool/dbconfig/20240221-154918-root.json
  • 15:47 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2137.codfw.wmnet with reason: host reimage
  • 15:46 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 15:46 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 15:44 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2137.codfw.wmnet with reason: host reimage
  • 15:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P57580 and previous config saved to /var/cache/conftool/dbconfig/20240221-154333-marostegui.json
  • 15:42 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:25:00 on db2106.codfw.wmnet with reason: T355874 - Migrate servers in codfw rack A6 from asw-a6-codfw to lsw1-a6-codfw
  • 15:41 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 0:25:00 on db2106.codfw.wmnet with reason: T355874 - Migrate servers in codfw rack A6 from asw-a6-codfw to lsw1-a6-codfw
  • 15:41 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:25:00 on db2146.codfw.wmnet with reason: T355874 - Migrate servers in codfw rack A6 from asw-a6-codfw to lsw1-a6-codfw
  • 15:41 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 0:25:00 on db2146.codfw.wmnet with reason: T355874 - Migrate servers in codfw rack A6 from asw-a6-codfw to lsw1-a6-codfw
  • 15:40 arnaudb@cumin1002: dbctl commit (dc=all): 'T355874 - depooling db2146 db2106', diff saved to https://phabricator.wikimedia.org/P57579 and previous config saved to /var/cache/conftool/dbconfig/20240221-154056-arnaudb.json
  • 15:39 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 15:39 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 15:39 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1236 (T357189)', diff saved to https://phabricator.wikimedia.org/P57578 and previous config saved to /var/cache/conftool/dbconfig/20240221-153926-arnaudb.json
  • 15:34 marostegui@cumin1002: dbctl commit (dc=all): 'es2026 (re)pooling @ 10%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57577 and previous config saved to /var/cache/conftool/dbconfig/20240221-153414-root.json
  • 15:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2173 (T355609)', diff saved to https://phabricator.wikimedia.org/P57576 and previous config saved to /var/cache/conftool/dbconfig/20240221-152826-marostegui.json
  • 15:24 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1236', diff saved to https://phabricator.wikimedia.org/P57575 and previous config saved to /var/cache/conftool/dbconfig/20240221-152420-arnaudb.json
  • 15:21 cmooney@cumin1002: START - Cookbook sre.hosts.reimage for host db2137.codfw.wmnet with OS bookworm
  • 15:19 marostegui@cumin1002: dbctl commit (dc=all): 'es2026 (re)pooling @ 5%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57574 and previous config saved to /var/cache/conftool/dbconfig/20240221-151909-root.json
  • 15:12 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
  • 15:12 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/thumbor: apply
  • 15:10 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:10 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns entries for private1-b-codfw - cmooney@cumin1002"
  • 14:55 TheresNoTime: UTC afternoon backport window done
  • 14:54 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/thumbor: apply
  • 14:54 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:54 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns entries for private1-a-codfw - cmooney@cumin1002"
  • 14:54 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1236 (T357189)', diff saved to https://phabricator.wikimedia.org/P57570 and previous config saved to /var/cache/conftool/dbconfig/20240221-145407-arnaudb.json
  • 14:53 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns entries for private1-a-codfw - cmooney@cumin1002"
  • 14:53 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
  • 14:52 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/thumbor: apply
  • 14:49 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
  • 14:48 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/thumbor: apply
  • 14:47 TheresNoTime: [samtar@mwmaint2002 ~]$ mwscript namespaceDupes.php --wiki hewikinews --fix #T349581
  • 14:47 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1236 (T357189)', diff saved to https://phabricator.wikimedia.org/P57569 and previous config saved to /var/cache/conftool/dbconfig/20240221-144702-arnaudb.json
  • 14:46 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1236.eqiad.wmnet with reason: Maintenance
  • 14:46 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1236.eqiad.wmnet with reason: Maintenance
  • 14:46 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T357189)', diff saved to https://phabricator.wikimedia.org/P57568 and previous config saved to /var/cache/conftool/dbconfig/20240221-144641-arnaudb.json
  • 14:46 samtar@deploy2002: Finished scap: Backport for cswiki, commonswiki, enwiki: Lift IP cap for WikiGap Editathon, mywiki: create portal and draft namespace (T352424) (duration: 20m 23s)
  • 14:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P57567 and previous config saved to /var/cache/conftool/dbconfig/20240221-144536-marostegui.json
  • 14:44 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2026.codfw.wmnet with reason: host reimage
  • 14:44 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 14:43 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
  • 14:42 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es2026.codfw.wmnet with reason: host reimage
  • 14:40 aborrero@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1033.eqiad.wmnet with OS bookworm
  • 14:38 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:38 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns entries for private1-a-codfw - cmooney@cumin1002"
  • 14:37 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns entries for private1-a-codfw - cmooney@cumin1002"
  • 14:37 samtar@deploy2002: samtar and anzx: Continuing with sync
  • 14:34 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 14:33 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/thumbor: apply
  • 14:33 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
  • 14:33 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/thumbor: apply
  • 14:32 arnaudb@cumin1002: dbctl commit (dc=all): 'db1180 (re)pooling @ 100%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57566 and previous config saved to /var/cache/conftool/dbconfig/20240221-143239-arnaudb.json
  • 14:31 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P57565 and previous config saved to /var/cache/conftool/dbconfig/20240221-143133-arnaudb.json
  • 14:31 arnaudb@cumin1002: dbctl commit (dc=all): 'db1213 (re)pooling @ 100%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57564 and previous config saved to /var/cache/conftool/dbconfig/20240221-143120-arnaudb.json
  • 14:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P57563 and previous config saved to /var/cache/conftool/dbconfig/20240221-143030-marostegui.json
  • 14:27 samtar@deploy2002: samtar and anzx: Backport for cswiki, commonswiki, enwiki: Lift IP cap for WikiGap Editathon, mywiki: create portal and draft namespace (T352424) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:26 samtar@deploy2002: Started scap: Backport for cswiki, commonswiki, enwiki: Lift IP cap for WikiGap Editathon, mywiki: create portal and draft namespace (T352424)
  • 14:24 cmooney@cumin1002: START - Cookbook sre.hosts.reimage for host es2026.codfw.wmnet with OS bookworm
  • 14:23 samtar@deploy2002: Finished scap: Backport for zhwiki: Create group ipblock-exempt-grantor (T357991) (duration: 11m 05s)
  • 14:22 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "new apt server in codfw - jmm@cumin2002 - T331613"
  • 14:20 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "new apt server in codfw - jmm@cumin2002 - T331613"
  • 14:17 arnaudb@cumin1002: dbctl commit (dc=all): 'db1180 (re)pooling @ 75%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57562 and previous config saved to /var/cache/conftool/dbconfig/20240221-141734-arnaudb.json
  • 14:16 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P57561 and previous config saved to /var/cache/conftool/dbconfig/20240221-141627-arnaudb.json
  • 14:16 arnaudb@cumin1002: dbctl commit (dc=all): 'db1213 (re)pooling @ 75%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57560 and previous config saved to /var/cache/conftool/dbconfig/20240221-141615-arnaudb.json
  • 14:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170 (T355609)', diff saved to https://phabricator.wikimedia.org/P57559 and previous config saved to /var/cache/conftool/dbconfig/20240221-141523-marostegui.json
  • 14:15 samtar@deploy2002: stang and samtar: Continuing with sync
  • 14:13 samtar@deploy2002: stang and samtar: Backport for zhwiki: Create group ipblock-exempt-grantor (T357991) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:12 samtar@deploy2002: Started scap: Backport for zhwiki: Create group ipblock-exempt-grantor (T357991)
  • 14:10 aborrero@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1033.eqiad.wmnet with reason: host reimage
  • 14:08 claime: restarted ferm.service on kubernetes2055.codfw.wmnet mw2440.codfw.wmnet mw2297.codfw.wmnet kubernetes2016.codfw.wmnet - T354855
  • 14:07 aborrero@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1033.eqiad.wmnet with reason: host reimage
  • 14:05 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=97) for new host apt2002.wikimedia.org
  • 14:05 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host apt2002.wikimedia.org with OS bookworm
  • 14:02 arnaudb@cumin1002: dbctl commit (dc=all): 'db1180 (re)pooling @ 50%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57558 and previous config saved to /var/cache/conftool/dbconfig/20240221-140229-arnaudb.json
  • 14:01 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T357189)', diff saved to https://phabricator.wikimedia.org/P57557 and previous config saved to /var/cache/conftool/dbconfig/20240221-140120-arnaudb.json
  • 14:01 arnaudb@cumin1002: dbctl commit (dc=all): 'db1213 (re)pooling @ 50%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57556 and previous config saved to /var/cache/conftool/dbconfig/20240221-140110-arnaudb.json
  • 13:59 topranks: adding IRB anycast interface on private1-a-codfw vlan to lsw1-a4-codfw
  • 13:50 aborrero@cumin1002: START - Cookbook sre.hosts.reimage for host cloudvirt1033.eqiad.wmnet with OS bookworm
  • 13:50 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1227 (T357189)', diff saved to https://phabricator.wikimedia.org/P57555 and previous config saved to /var/cache/conftool/dbconfig/20240221-135031-arnaudb.json
  • 13:50 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1227.eqiad.wmnet with reason: Maintenance
  • 13:50 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1227.eqiad.wmnet with reason: Maintenance
  • 13:50 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T357189)', diff saved to https://phabricator.wikimedia.org/P57554 and previous config saved to /var/cache/conftool/dbconfig/20240221-135009-arnaudb.json
  • 13:47 arnaudb@cumin1002: dbctl commit (dc=all): 'db1180 (re)pooling @ 20%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57553 and previous config saved to /var/cache/conftool/dbconfig/20240221-134724-arnaudb.json
  • 13:46 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2142.codfw.wmnet
  • 13:46 arnaudb@cumin1002: dbctl commit (dc=all): 'db1213 (re)pooling @ 20%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57552 and previous config saved to /var/cache/conftool/dbconfig/20240221-134605-arnaudb.json
  • 13:45 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db1180.eqiad.wmnet
  • 13:44 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db1213.eqiad.wmnet
  • 13:41 arnaudb@cumin1002: START - Cookbook sre.mysql.upgrade for db2142.codfw.wmnet
  • 13:41 arnaudb@cumin1002: START - Cookbook sre.mysql.upgrade for db1213.eqiad.wmnet
  • 13:40 arnaudb@cumin1002: START - Cookbook sre.mysql.upgrade for db1180.eqiad.wmnet
  • 13:40 Dreamy_Jazz: Re-started MediaModeration scanning script using `mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --use-jobqueue --sleep 30 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-30-no-render-now.txt` - See T351400
  • 13:40 arnaudb@cumin1002: dbctl commit (dc=all): 'T356240 - depooling db1180 db1213 db2142', diff saved to https://phabricator.wikimedia.org/P57551 and previous config saved to /var/cache/conftool/dbconfig/20240221-134015-arnaudb.json
  • 13:40 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on db2142.codfw.wmnet,db[1180,1213].eqiad.wmnet with reason: Silence for reboot T356240
  • 13:39 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on db2142.codfw.wmnet,db[1180,1213].eqiad.wmnet with reason: Silence for reboot T356240
  • 13:35 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P57550 and previous config saved to /var/cache/conftool/dbconfig/20240221-133503-arnaudb.json
  • 13:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on apt2002.wikimedia.org with reason: host reimage
  • 13:32 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on apt2002.wikimedia.org with reason: host reimage
  • 13:22 cmooney@cumin1002: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
  • 13:22 cmooney@cumin1002: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
  • 13:21 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2170 (T355609)', diff saved to https://phabricator.wikimedia.org/P57549 and previous config saved to /var/cache/conftool/dbconfig/20240221-132156-marostegui.json
  • 13:21 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2170.codfw.wmnet with reason: Maintenance
  • 13:21 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2170.codfw.wmnet with reason: Maintenance
  • 13:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T355609)', diff saved to https://phabricator.wikimedia.org/P57548 and previous config saved to /var/cache/conftool/dbconfig/20240221-132134-marostegui.json
  • 13:19 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P57547 and previous config saved to /var/cache/conftool/dbconfig/20240221-131957-arnaudb.json
  • 13:18 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host apt2002.wikimedia.org with OS bookworm
  • 13:16 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM apt2002.wikimedia.org - jmm@cumin2002"
  • 13:15 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM apt2002.wikimedia.org - jmm@cumin2002"
  • 13:14 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) apt2002.wikimedia.org on all recursors
  • 13:14 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache apt2002.wikimedia.org on all recursors
  • 13:14 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:14 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM apt2002.wikimedia.org - jmm@cumin2002"
  • 13:13 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM apt2002.wikimedia.org - jmm@cumin2002"
  • 13:11 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 13:11 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host apt2002.wikimedia.org
  • 13:11 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.resource-report (exit_code=0)
  • 13:11 jmm@cumin2002: START - Cookbook sre.ganeti.resource-report
  • 13:08 samtar@deploy2002: Finished scap: Backport for InitialiseSettings: Enable Edit Recovery on 3 projects (T355548) (duration: 14m 36s)
  • 13:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P57546 and previous config saved to /var/cache/conftool/dbconfig/20240221-130628-marostegui.json
  • 13:04 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T357189)', diff saved to https://phabricator.wikimedia.org/P57545 and previous config saved to /var/cache/conftool/dbconfig/20240221-130450-arnaudb.json
  • 13:03 aborrero@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "cloudvirt1033 - aborrero@cumin1002"
  • 13:02 aborrero@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "cloudvirt1033 - aborrero@cumin1002"
  • 13:00 samtar@deploy2002: samtar: Continuing with sync
  • 12:57 Daimona: T357007 Running mwscript /home/daimona/GenerateInvitationList.php --wiki=metawiki --listfile=/home/daimona/list.txt (same as current master)
  • 12:57 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1202 (T357189)', diff saved to https://phabricator.wikimedia.org/P57544 and previous config saved to /var/cache/conftool/dbconfig/20240221-125711-arnaudb.json
  • 12:57 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1202.eqiad.wmnet with reason: Maintenance
  • 12:56 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1202.eqiad.wmnet with reason: Maintenance
  • 12:56 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T357189)', diff saved to https://phabricator.wikimedia.org/P57543 and previous config saved to /var/cache/conftool/dbconfig/20240221-125648-arnaudb.json
  • 12:55 samtar@deploy2002: samtar: Backport for InitialiseSettings: Enable Edit Recovery on 3 projects (T355548) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 12:53 samtar@deploy2002: Started scap: Backport for InitialiseSettings: Enable Edit Recovery on 3 projects (T355548)
  • 12:52 aborrero@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1033.eqiad.wmnet with OS bookworm
  • 12:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P57542 and previous config saved to /var/cache/conftool/dbconfig/20240221-125121-marostegui.json
  • 12:41 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P57541 and previous config saved to /var/cache/conftool/dbconfig/20240221-124142-arnaudb.json
  • 12:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T355609)', diff saved to https://phabricator.wikimedia.org/P57540 and previous config saved to /var/cache/conftool/dbconfig/20240221-123615-marostegui.json
  • 12:34 arnaudb@cumin1002: dbctl commit (dc=all): 'db2193 (re)pooling @ 100%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57539 and previous config saved to /var/cache/conftool/dbconfig/20240221-123439-arnaudb.json
  • 12:34 arnaudb@cumin1002: dbctl commit (dc=all): 'db2192 (re)pooling @ 100%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57538 and previous config saved to /var/cache/conftool/dbconfig/20240221-123423-arnaudb.json
  • 12:34 arnaudb@cumin1002: dbctl commit (dc=all): 'db2191 (re)pooling @ 100%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57537 and previous config saved to /var/cache/conftool/dbconfig/20240221-123410-arnaudb.json
  • 12:26 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P57536 and previous config saved to /var/cache/conftool/dbconfig/20240221-122636-arnaudb.json
  • 12:24 akosiaris@cumin1002: conftool action : set/pooled=true; selector: dnsdisc=mw-parsoid,name=codfw
  • 12:24 aborrero@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1033.eqiad.wmnet with reason: host reimage
  • 12:22 kart_: Updated cxserver to 2024-02-21-112101-production (T357769)
  • 12:21 aborrero@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1033.eqiad.wmnet with reason: host reimage
  • 12:21 marostegui@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es2026.codfw.wmnet with OS bookworm
  • 12:20 kartik@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
  • 12:20 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
  • 12:20 kartik@deploy2002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
  • 12:20 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
  • 12:20 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
  • 12:20 kartik@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
  • 12:19 arnaudb@cumin1002: dbctl commit (dc=all): 'db2193 (re)pooling @ 75%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57535 and previous config saved to /var/cache/conftool/dbconfig/20240221-121934-arnaudb.json
  • 12:19 arnaudb@cumin1002: dbctl commit (dc=all): 'db2192 (re)pooling @ 75%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57534 and previous config saved to /var/cache/conftool/dbconfig/20240221-121918-arnaudb.json
  • 12:19 arnaudb@cumin1002: dbctl commit (dc=all): 'db2191 (re)pooling @ 75%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57533 and previous config saved to /var/cache/conftool/dbconfig/20240221-121906-arnaudb.json
  • 12:18 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
  • 12:18 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/cxserver: apply
  • 12:15 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-page-content-change-enrich: apply
  • 12:15 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-page-content-change-enrich: apply
  • 12:15 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es2026.codfw.wmnet with OS bookworm
  • 12:15 marostegui@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es2026.codfw.wmnet with OS bookworm
  • 12:14 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
  • 12:14 kartik@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
  • 12:13 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply
  • 12:13 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
  • 12:12 claime: mw-page-content-change-enrich: Switch to mw-api-int-async - T357785
  • 12:11 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T357189)', diff saved to https://phabricator.wikimedia.org/P57532 and previous config saved to /var/cache/conftool/dbconfig/20240221-121129-arnaudb.json
  • 12:10 akosiaris: restart pybal on lvs2013, lvs1019 to pick up mw-parsoid service. T357392
  • 12:09 aborrero@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1033
  • 12:09 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2153 (T355609)', diff saved to https://phabricator.wikimedia.org/P57531 and previous config saved to /var/cache/conftool/dbconfig/20240221-120949-marostegui.json
  • 12:09 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2153.codfw.wmnet with reason: Maintenance
  • 12:09 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2153.codfw.wmnet with reason: Maintenance
  • 12:09 aborrero@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1033
  • 12:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T355609)', diff saved to https://phabricator.wikimedia.org/P57530 and previous config saved to /var/cache/conftool/dbconfig/20240221-120927-marostegui.json
  • 12:05 arnaudb@cumin1002: dbctl commit (dc=all): 'db2193 (re)pooling @ 50%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57529 and previous config saved to /var/cache/conftool/dbconfig/20240221-120429-arnaudb.json
  • 12:05 aborrero@cumin1002: START - Cookbook sre.hosts.reimage for host cloudvirt1033.eqiad.wmnet with OS bookworm
  • 12:04 arnaudb@cumin1002: dbctl commit (dc=all): 'db2192 (re)pooling @ 50%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57528 and previous config saved to /var/cache/conftool/dbconfig/20240221-120414-arnaudb.json
  • 12:04 arnaudb@cumin1002: dbctl commit (dc=all): 'db2191 (re)pooling @ 50%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57527 and previous config saved to /var/cache/conftool/dbconfig/20240221-120401-arnaudb.json
  • 12:04 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1194 (T357189)', diff saved to https://phabricator.wikimedia.org/P57526 and previous config saved to /var/cache/conftool/dbconfig/20240221-120345-arnaudb.json
  • 12:04 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es2026.codfw.wmnet with OS bookworm
  • 12:03 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1194.eqiad.wmnet with reason: Maintenance
  • 12:03 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1194.eqiad.wmnet with reason: Maintenance
  • 12:03 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T357189)', diff saved to https://phabricator.wikimedia.org/P57525 and previous config saved to /var/cache/conftool/dbconfig/20240221-120324-arnaudb.json
  • 12:02 akosiaris: restart pybal on lvs2014 to pick up mw-parsoid service. T357392
  • 12:02 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es2026 T358080', diff saved to https://phabricator.wikimedia.org/P57524 and previous config saved to /var/cache/conftool/dbconfig/20240221-120202-root.json
  • 12:01 akosiaris: restart pybal on lvs1020 to pick up mw-parsoid service. T357392
  • 12:00 marostegui@cumin1002: dbctl commit (dc=all): 'es2031 (re)pooling @ 100%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57523 and previous config saved to /var/cache/conftool/dbconfig/20240221-120051-root.json
  • 11:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P57522 and previous config saved to /var/cache/conftool/dbconfig/20240221-115421-marostegui.json
  • 11:49 arnaudb@cumin1002: dbctl commit (dc=all): 'db2193 (re)pooling @ 20%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57521 and previous config saved to /var/cache/conftool/dbconfig/20240221-114925-arnaudb.json
  • 11:49 arnaudb@cumin1002: dbctl commit (dc=all): 'db2192 (re)pooling @ 20%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57520 and previous config saved to /var/cache/conftool/dbconfig/20240221-114909-arnaudb.json
  • 11:48 arnaudb@cumin1002: dbctl commit (dc=all): 'db2191 (re)pooling @ 20%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57519 and previous config saved to /var/cache/conftool/dbconfig/20240221-114856-arnaudb.json
  • 11:48 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P57518 and previous config saved to /var/cache/conftool/dbconfig/20240221-114817-arnaudb.json
  • 11:45 marostegui@cumin1002: dbctl commit (dc=all): 'es2031 (re)pooling @ 75%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57517 and previous config saved to /var/cache/conftool/dbconfig/20240221-114546-root.json
  • 11:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P57516 and previous config saved to /var/cache/conftool/dbconfig/20240221-113914-marostegui.json
  • 11:36 volans@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:36 volans@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Added cassandra IPs for restbase10[34-42] - volans@cumin1002"
  • 11:35 volans@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Added cassandra IPs for restbase10[34-42] - volans@cumin1002"
  • 11:33 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P57515 and previous config saved to /var/cache/conftool/dbconfig/20240221-113311-arnaudb.json
  • 11:32 volans@cumin1002: START - Cookbook sre.dns.netbox
  • 11:32 volans@cumin1002: END (ERROR) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=97) generate netbox hiera data: "Added cassandra IPs for restbase10[34-42] - volans@cumin1002"
  • 11:32 volans@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Added cassandra IPs for restbase10[34-42] - volans@cumin1002"
  • 11:30 marostegui@cumin1002: dbctl commit (dc=all): 'es2031 (re)pooling @ 50%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57514 and previous config saved to /var/cache/conftool/dbconfig/20240221-113041-root.json
  • 11:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T355609)', diff saved to https://phabricator.wikimedia.org/P57513 and previous config saved to /var/cache/conftool/dbconfig/20240221-112408-marostegui.json
  • 11:18 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T357189)', diff saved to https://phabricator.wikimedia.org/P57512 and previous config saved to /var/cache/conftool/dbconfig/20240221-111805-arnaudb.json
  • 11:17 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db1151.eqiad.wmnet
  • 11:16 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2191.codfw.wmnet
  • 11:16 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2192.codfw.wmnet
  • 11:16 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2193.codfw.wmnet
  • 11:15 marostegui@cumin1002: dbctl commit (dc=all): 'es2031 (re)pooling @ 25%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57511 and previous config saved to /var/cache/conftool/dbconfig/20240221-111536-root.json
  • 11:13 marostegui@cumin1002: dbctl commit (dc=all): 'es1030 (re)pooling @ 100%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57510 and previous config saved to /var/cache/conftool/dbconfig/20240221-111348-root.json
  • 11:13 arnaudb@cumin1002: START - Cookbook sre.mysql.upgrade for db2192.codfw.wmnet
  • 11:12 arnaudb@cumin1002: START - Cookbook sre.mysql.upgrade for db2193.codfw.wmnet
  • 11:12 arnaudb@cumin1002: START - Cookbook sre.mysql.upgrade for db1151.eqiad.wmnet
  • 11:12 arnaudb@cumin1002: START - Cookbook sre.mysql.upgrade for db2191.codfw.wmnet
  • 11:12 arnaudb@cumin1002: dbctl commit (dc=all): 'T356240 - depooling db2191 db2192 db2193 db1151', diff saved to https://phabricator.wikimedia.org/P57508 and previous config saved to /var/cache/conftool/dbconfig/20240221-111023-arnaudb.json
  • 11:11 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on db[2191-2193].codfw.wmnet,db1151.eqiad.wmnet with reason: Silence for reboot T356240
  • 11:11 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1191 (T357189)', diff saved to https://phabricator.wikimedia.org/P57507 and previous config saved to /var/cache/conftool/dbconfig/20240221-111012-arnaudb.json
  • 11:11 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1191.eqiad.wmnet with reason: Maintenance
  • 11:11 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on db[2191-2193].codfw.wmnet,db1151.eqiad.wmnet with reason: Silence for reboot T356240
  • 11:10 stran@deploy2002: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
  • 11:10 stran@deploy2002: helmfile [codfw] START helmfile.d/services/ipoid: apply
  • 11:10 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1191.eqiad.wmnet with reason: Maintenance
  • 11:10 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T357189)', diff saved to https://phabricator.wikimedia.org/P57506 and previous config saved to /var/cache/conftool/dbconfig/20240221-110951-arnaudb.json
  • 11:09 stran@deploy2002: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
  • 11:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetserver1001.eqiad.wmnet
  • 11:08 stran@deploy2002: helmfile [codfw] START helmfile.d/services/ipoid: apply
  • 11:08 stran@deploy2002: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
  • 11:07 stran@deploy2002: helmfile [eqiad] START helmfile.d/services/ipoid: apply
  • 11:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetserver1001.eqiad.wmnet
  • 11:05 stran@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
  • 11:05 stran@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
  • 11:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetserver2002.codfw.wmnet
  • 11:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetserver2002.codfw.wmnet
  • 11:00 marostegui@cumin1002: dbctl commit (dc=all): 'es2031 (re)pooling @ 10%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57505 and previous config saved to /var/cache/conftool/dbconfig/20240221-110031-root.json
  • 10:58 marostegui@cumin1002: dbctl commit (dc=all): 'es1030 (re)pooling @ 75%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57504 and previous config saved to /var/cache/conftool/dbconfig/20240221-105844-root.json
  • 10:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2146 (T355609)', diff saved to https://phabricator.wikimedia.org/P57503 and previous config saved to /var/cache/conftool/dbconfig/20240221-105654-marostegui.json
  • 10:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2146.codfw.wmnet with reason: Maintenance
  • 10:56 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2146.codfw.wmnet with reason: Maintenance
  • 10:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T355609)', diff saved to https://phabricator.wikimedia.org/P57502 and previous config saved to /var/cache/conftool/dbconfig/20240221-105630-marostegui.json
  • 10:54 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P57501 and previous config saved to /var/cache/conftool/dbconfig/20240221-105445-arnaudb.json
  • 10:45 marostegui@cumin1002: dbctl commit (dc=all): 'es2031 (re)pooling @ 5%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57500 and previous config saved to /var/cache/conftool/dbconfig/20240221-104526-root.json
  • 10:43 marostegui@cumin1002: dbctl commit (dc=all): 'es1030 (re)pooling @ 50%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57499 and previous config saved to /var/cache/conftool/dbconfig/20240221-104339-root.json
  • 10:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P57498 and previous config saved to /var/cache/conftool/dbconfig/20240221-104124-marostegui.json
  • 10:39 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P57497 and previous config saved to /var/cache/conftool/dbconfig/20240221-103938-arnaudb.json
  • 10:37 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 10:36 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
  • 10:36 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 10:35 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
  • 10:35 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2031.codfw.wmnet with OS bookworm
  • 10:34 hnowlan@deploy2002: helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 10:34 hnowlan@deploy2002: helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply
  • 10:32 hnowlan@deploy2002: helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 10:32 hnowlan@deploy2002: helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply
  • 10:28 marostegui@cumin1002: dbctl commit (dc=all): 'es1030 (re)pooling @ 25%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57496 and previous config saved to /var/cache/conftool/dbconfig/20240221-102833-root.json
  • 10:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P57495 and previous config saved to /var/cache/conftool/dbconfig/20240221-102618-marostegui.json
  • 10:24 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T357189)', diff saved to https://phabricator.wikimedia.org/P57494 and previous config saved to /var/cache/conftool/dbconfig/20240221-102432-arnaudb.json
  • 10:16 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1174 (T357189)', diff saved to https://phabricator.wikimedia.org/P57493 and previous config saved to /var/cache/conftool/dbconfig/20240221-101646-arnaudb.json
  • 10:16 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 10:16 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 10:14 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2031.codfw.wmnet with reason: host reimage
  • 10:13 marostegui@cumin1002: dbctl commit (dc=all): 'es1030 (re)pooling @ 10%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57492 and previous config saved to /var/cache/conftool/dbconfig/20240221-101328-root.json
  • 10:12 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es2031.codfw.wmnet with reason: host reimage
  • 10:12 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest1003.eqiad.wmnet with OS bookworm
  • 10:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T355609)', diff saved to https://phabricator.wikimedia.org/P57491 and previous config saved to /var/cache/conftool/dbconfig/20240221-101111-marostegui.json
  • 10:08 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 10:08 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 10:08 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170 (T357189)', diff saved to https://phabricator.wikimedia.org/P57490 and previous config saved to /var/cache/conftool/dbconfig/20240221-100815-arnaudb.json
  • 09:58 marostegui@cumin1002: dbctl commit (dc=all): 'es1030 (re)pooling @ 5%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57489 and previous config saved to /var/cache/conftool/dbconfig/20240221-095823-root.json
  • 09:56 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest1003.eqiad.wmnet with reason: host reimage
  • 09:53 ayounsi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest1003.eqiad.wmnet with reason: host reimage
  • 09:53 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es2031.codfw.wmnet with OS bookworm
  • 09:53 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P57488 and previous config saved to /var/cache/conftool/dbconfig/20240221-095309-arnaudb.json
  • 09:52 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es2031 T358080', diff saved to https://phabricator.wikimedia.org/P57487 and previous config saved to /var/cache/conftool/dbconfig/20240221-095205-root.json
  • 09:45 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2145 (T355609)', diff saved to https://phabricator.wikimedia.org/P57486 and previous config saved to /var/cache/conftool/dbconfig/20240221-094516-marostegui.json
  • 09:45 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2145.codfw.wmnet with reason: Maintenance
  • 09:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2145.codfw.wmnet with reason: Maintenance
  • 09:43 marostegui@cumin1002: dbctl commit (dc=all): 'es1030 (re)pooling @ 1%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57485 and previous config saved to /var/cache/conftool/dbconfig/20240221-094319-root.json
  • 09:42 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1030.eqiad.wmnet with OS bookworm
  • 09:40 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host sretest1003.eqiad.wmnet with OS bookworm
  • 09:38 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P57484 and previous config saved to /var/cache/conftool/dbconfig/20240221-093802-arnaudb.json
  • 09:26 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1030.eqiad.wmnet with reason: host reimage
  • 09:24 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es1030.eqiad.wmnet with reason: host reimage
  • 09:23 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2141.codfw.wmnet with reason: Maintenance
  • 09:23 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170 (T357189)', diff saved to https://phabricator.wikimedia.org/P57482 and previous config saved to /var/cache/conftool/dbconfig/20240221-092256-arnaudb.json
  • 09:22 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2141.codfw.wmnet with reason: Maintenance
  • 09:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2130 (T355609)', diff saved to https://phabricator.wikimedia.org/P57481 and previous config saved to /var/cache/conftool/dbconfig/20240221-092251-marostegui.json
  • 09:15 arnaudb@cumin1002: dbctl commit (dc=all): 'db2190 (re)pooling @ 100%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57480 and previous config saved to /var/cache/conftool/dbconfig/20240221-091531-arnaudb.json
  • 09:15 arnaudb@cumin1002: dbctl commit (dc=all): 'db2189 (re)pooling @ 100%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57479 and previous config saved to /var/cache/conftool/dbconfig/20240221-091521-arnaudb.json
  • 09:15 arnaudb@cumin1002: dbctl commit (dc=all): 'db2188 (re)pooling @ 100%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57478 and previous config saved to /var/cache/conftool/dbconfig/20240221-091509-arnaudb.json
  • 09:14 arnaudb@cumin1002: dbctl commit (dc=all): 'db2180 (re)pooling @ 100%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57477 and previous config saved to /var/cache/conftool/dbconfig/20240221-091449-arnaudb.json
  • 09:14 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1170 (T357189)', diff saved to https://phabricator.wikimedia.org/P57476 and previous config saved to /var/cache/conftool/dbconfig/20240221-091358-arnaudb.json
  • 09:13 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 09:13 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 09:13 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T357189)', diff saved to https://phabricator.wikimedia.org/P57475 and previous config saved to /var/cache/conftool/dbconfig/20240221-091337-arnaudb.json
  • 09:10 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es1030.eqiad.wmnet with OS bookworm
  • 09:09 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es1030 T358080', diff saved to https://phabricator.wikimedia.org/P57474 and previous config saved to /var/cache/conftool/dbconfig/20240221-090957-root.json
  • 09:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P57473 and previous config saved to /var/cache/conftool/dbconfig/20240221-090744-marostegui.json
  • 09:06 marostegui@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1018.eqiad.wmnet,service=s2
  • 09:06 marostegui@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1018.eqiad.wmnet,service=s7
  • 09:00 arnaudb@cumin1002: dbctl commit (dc=all): 'db2190 (re)pooling @ 75%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57472 and previous config saved to /var/cache/conftool/dbconfig/20240221-090026-arnaudb.json
  • 09:00 arnaudb@cumin1002: dbctl commit (dc=all): 'db2189 (re)pooling @ 75%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57471 and previous config saved to /var/cache/conftool/dbconfig/20240221-090016-arnaudb.json
  • 09:00 hashar: Restarted CI Jenkins on contint2002 to update the timestamper plugin
  • 09:00 arnaudb@cumin1002: dbctl commit (dc=all): 'db2188 (re)pooling @ 75%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57470 and previous config saved to /var/cache/conftool/dbconfig/20240221-090004-arnaudb.json
  • 08:59 arnaudb@cumin1002: dbctl commit (dc=all): 'db2180 (re)pooling @ 75%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57469 and previous config saved to /var/cache/conftool/dbconfig/20240221-085944-arnaudb.json
  • 08:58 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P57468 and previous config saved to /var/cache/conftool/dbconfig/20240221-085830-arnaudb.json
  • 08:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P57467 and previous config saved to /var/cache/conftool/dbconfig/20240221-085238-marostegui.json
  • 08:45 arnaudb@cumin1002: dbctl commit (dc=all): 'db2190 (re)pooling @ 50%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57466 and previous config saved to /var/cache/conftool/dbconfig/20240221-084521-arnaudb.json
  • 08:45 arnaudb@cumin1002: dbctl commit (dc=all): 'db2189 (re)pooling @ 50%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57465 and previous config saved to /var/cache/conftool/dbconfig/20240221-084511-arnaudb.json
  • 08:45 arnaudb@cumin1002: dbctl commit (dc=all): 'db2188 (re)pooling @ 50%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57464 and previous config saved to /var/cache/conftool/dbconfig/20240221-084459-arnaudb.json
  • 08:44 arnaudb@cumin1002: dbctl commit (dc=all): 'db2180 (re)pooling @ 50%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57463 and previous config saved to /var/cache/conftool/dbconfig/20240221-084440-arnaudb.json
  • 08:43 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P57462 and previous config saved to /var/cache/conftool/dbconfig/20240221-084325-arnaudb.json
  • 08:43 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts sretest2005.codfw.wmnet
  • 08:43 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:41 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
  • 08:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2130 (T355609)', diff saved to https://phabricator.wikimedia.org/P57461 and previous config saved to /var/cache/conftool/dbconfig/20240221-083731-marostegui.json
  • 08:36 ayounsi@cumin1002: START - Cookbook sre.hosts.decommission for hosts sretest2005.codfw.wmnet
  • 08:30 arnaudb@cumin1002: dbctl commit (dc=all): 'db2190 (re)pooling @ 20%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57460 and previous config saved to /var/cache/conftool/dbconfig/20240221-083016-arnaudb.json
  • 08:30 arnaudb@cumin1002: dbctl commit (dc=all): 'db2189 (re)pooling @ 20%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57459 and previous config saved to /var/cache/conftool/dbconfig/20240221-083006-arnaudb.json
  • 08:29 arnaudb@cumin1002: dbctl commit (dc=all): 'db2188 (re)pooling @ 20%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57458 and previous config saved to /var/cache/conftool/dbconfig/20240221-082955-arnaudb.json
  • 08:29 arnaudb@cumin1002: dbctl commit (dc=all): 'db2180 (re)pooling @ 20%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57457 and previous config saved to /var/cache/conftool/dbconfig/20240221-082935-arnaudb.json
  • 08:28 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2180.codfw.wmnet
  • 08:28 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2190.codfw.wmnet
  • 08:28 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T357189)', diff saved to https://phabricator.wikimedia.org/P57456 and previous config saved to /var/cache/conftool/dbconfig/20240221-082818-arnaudb.json
  • 08:28 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2189.codfw.wmnet
  • 08:28 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2188.codfw.wmnet
  • 08:23 arnaudb@cumin1002: START - Cookbook sre.mysql.upgrade for db2189.codfw.wmnet
  • 08:23 arnaudb@cumin1002: START - Cookbook sre.mysql.upgrade for db2190.codfw.wmnet
  • 08:23 arnaudb@cumin1002: START - Cookbook sre.mysql.upgrade for db2188.codfw.wmnet
  • 08:23 arnaudb@cumin1002: START - Cookbook sre.mysql.upgrade for db2180.codfw.wmnet
  • 08:22 arnaudb@cumin1002: dbctl commit (dc=all): 'db2180 db2188 db2189 db2190 depool for T356240', diff saved to https://phabricator.wikimedia.org/P57455 and previous config saved to /var/cache/conftool/dbconfig/20240221-082219-arnaudb.json
  • 08:21 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on db[2180,2188-2190].codfw.wmnet with reason: Silence for reboot T356240
  • 08:21 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on db[2180,2188-2190].codfw.wmnet with reason: Silence for reboot T356240
  • 08:20 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1158 (T357189)', diff saved to https://phabricator.wikimedia.org/P57454 and previous config saved to /var/cache/conftool/dbconfig/20240221-082029-arnaudb.json
  • 08:20 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 08:20 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 08:20 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 08:19 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 08:08 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2130 (T355609)', diff saved to https://phabricator.wikimedia.org/P57452 and previous config saved to /var/cache/conftool/dbconfig/20240221-080836-marostegui.json
  • 08:08 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2130.codfw.wmnet with reason: Maintenance
  • 08:08 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2130.codfw.wmnet with reason: Maintenance
  • 08:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2116 (T355609)', diff saved to https://phabricator.wikimedia.org/P57451 and previous config saved to /var/cache/conftool/dbconfig/20240221-080814-marostegui.json
  • 07:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P57450 and previous config saved to /var/cache/conftool/dbconfig/20240221-075307-marostegui.json
  • 07:44 marostegui@cumin1002: dbctl commit (dc=all): 'es1026 (re)pooling @ 100%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57449 and previous config saved to /var/cache/conftool/dbconfig/20240221-074452-root.json
  • 07:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P57448 and previous config saved to /var/cache/conftool/dbconfig/20240221-073801-marostegui.json
  • 07:29 marostegui@cumin1002: dbctl commit (dc=all): 'es1026 (re)pooling @ 75%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57447 and previous config saved to /var/cache/conftool/dbconfig/20240221-072948-root.json
  • 07:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2116 (T355609)', diff saved to https://phabricator.wikimedia.org/P57446 and previous config saved to /var/cache/conftool/dbconfig/20240221-072255-marostegui.json
  • 07:14 marostegui@cumin1002: dbctl commit (dc=all): 'es1026 (re)pooling @ 50%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57445 and previous config saved to /var/cache/conftool/dbconfig/20240221-071443-root.json
  • 06:59 marostegui@cumin1002: dbctl commit (dc=all): 'es1026 (re)pooling @ 25%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57444 and previous config saved to /var/cache/conftool/dbconfig/20240221-065938-root.json
  • 06:55 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2116 (T355609)', diff saved to https://phabricator.wikimedia.org/P57443 and previous config saved to /var/cache/conftool/dbconfig/20240221-065508-marostegui.json
  • 06:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2116.codfw.wmnet with reason: Maintenance
  • 06:54 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2116.codfw.wmnet with reason: Maintenance
  • 06:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2103 (T355609)', diff saved to https://phabricator.wikimedia.org/P57442 and previous config saved to /var/cache/conftool/dbconfig/20240221-065447-marostegui.json
  • 06:44 marostegui@cumin1002: dbctl commit (dc=all): 'es1026 (re)pooling @ 10%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57441 and previous config saved to /var/cache/conftool/dbconfig/20240221-064433-root.json
  • 06:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2103', diff saved to https://phabricator.wikimedia.org/P57440 and previous config saved to /var/cache/conftool/dbconfig/20240221-063940-marostegui.json
  • 06:29 marostegui@cumin1002: dbctl commit (dc=all): 'es1026 (re)pooling @ 5%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57439 and previous config saved to /var/cache/conftool/dbconfig/20240221-062928-root.json
  • 06:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2103', diff saved to https://phabricator.wikimedia.org/P57438 and previous config saved to /var/cache/conftool/dbconfig/20240221-062434-marostegui.json
  • 06:13 marostegui@cumin1002: dbctl commit (dc=all): 'es1026 (re)pooling @ 1%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57437 and previous config saved to /var/cache/conftool/dbconfig/20240221-061325-root.json
  • 06:11 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1026.eqiad.wmnet with OS bookworm
  • 06:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2103 (T355609)', diff saved to https://phabricator.wikimedia.org/P57436 and previous config saved to /var/cache/conftool/dbconfig/20240221-060928-marostegui.json
  • 05:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1026.eqiad.wmnet with reason: host reimage
  • 05:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es1026.eqiad.wmnet with reason: host reimage
  • 05:45 kart_: Updated MinT to 2024-02-20-062448-production (T333969, T354666)
  • 05:42 kartik@deploy2002: helmfile [eqiad] DONE helmfile.d/services/machinetranslation: apply
  • 05:41 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2103 (T355609)', diff saved to https://phabricator.wikimedia.org/P57435 and previous config saved to /var/cache/conftool/dbconfig/20240221-054136-marostegui.json
  • 05:41 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
  • 05:41 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
  • 05:39 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es1026.eqiad.wmnet with OS bookworm
  • 05:38 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es1026 T358080', diff saved to https://phabricator.wikimedia.org/P57434 and previous config saved to /var/cache/conftool/dbconfig/20240221-053822-root.json
  • 05:33 kartik@deploy2002: helmfile [eqiad] START helmfile.d/services/machinetranslation: apply
  • 05:21 marostegui@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1018.eqiad.wmnet,service=s7
  • 05:21 marostegui@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1018.eqiad.wmnet,service=s2
  • 05:21 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/machinetranslation: apply
  • 05:14 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/machinetranslation: apply
  • 05:13 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2102.codfw.wmnet with reason: Maintenance
  • 05:13 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2102.codfw.wmnet with reason: Maintenance
  • 05:13 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/machinetranslation: apply
  • 05:09 kartik@deploy2002: helmfile [staging] START helmfile.d/services/machinetranslation: apply
  • 05:02 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2220.mgmt.codfw.wmnet with reboot policy FORCED
  • 04:58 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2219.mgmt.codfw.wmnet with reboot policy FORCED
  • 04:52 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2217.mgmt.codfw.wmnet with reboot policy FORCED
  • 04:51 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2218.mgmt.codfw.wmnet with reboot policy FORCED
  • 04:43 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2216.mgmt.codfw.wmnet with reboot policy FORCED
  • 04:42 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2220.mgmt.codfw.wmnet with reboot policy FORCED
  • 04:41 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 04:41 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2220 to codfw - jhancock@cumin2002"
  • 04:41 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2220 to codfw - jhancock@cumin2002"
  • 04:39 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 04:36 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2215.mgmt.codfw.wmnet with reboot policy FORCED
  • 04:36 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2219.mgmt.codfw.wmnet with reboot policy FORCED
  • 04:35 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 04:35 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2219 to codfw - jhancock@cumin2002"
  • 04:34 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2219 to codfw - jhancock@cumin2002"
  • 04:32 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 04:31 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 04:31 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2218.mgmt.codfw.wmnet with reboot policy FORCED
  • 04:30 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 04:30 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2218 to codfw - jhancock@cumin2002"
  • 04:30 rzl@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 04:29 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2218 to codfw - jhancock@cumin2002"
  • 04:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 04:25 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2217.mgmt.codfw.wmnet with reboot policy FORCED
  • 04:25 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 04:25 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2217 to codfw - jhancock@cumin2002"
  • 04:24 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2217 to codfw - jhancock@cumin2002"
  • 04:23 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2214.mgmt.codfw.wmnet with reboot policy FORCED
  • 04:22 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 04:21 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2216.mgmt.codfw.wmnet with reboot policy FORCED
  • 04:21 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2213.mgmt.codfw.wmnet with reboot policy FORCED
  • 04:20 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 04:20 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2216 to codfw - jhancock@cumin2002"
  • 04:19 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2216 to codfw - jhancock@cumin2002"
  • 04:18 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2212.mgmt.codfw.wmnet with reboot policy FORCED
  • 04:18 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 04:15 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2215.mgmt.codfw.wmnet with reboot policy FORCED
  • 04:15 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 04:15 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2215 to codfw - jhancock@cumin2002"
  • 04:14 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2215 to codfw - jhancock@cumin2002"
  • 04:14 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2211.mgmt.codfw.wmnet with reboot policy FORCED
  • 04:12 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 04:10 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2214.mgmt.codfw.wmnet with reboot policy FORCED
  • 04:09 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 04:09 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2214 to codfw - jhancock@cumin2002"
  • 04:08 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2214 to codfw - jhancock@cumin2002"
  • 04:06 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 04:00 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 04:00 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2213 to codfw - jhancock@cumin2002"
  • 03:59 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2213 to codfw - jhancock@cumin2002"
  • 03:58 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2212.mgmt.codfw.wmnet with reboot policy FORCED
  • 03:57 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 03:56 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 03:56 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2212 to codfw - jhancock@cumin2002"
  • 03:55 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2212 to codfw - jhancock@cumin2002"
  • 03:55 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 03:54 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 03:53 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 03:52 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2209.mgmt.codfw.wmnet with reboot policy FORCED
  • 03:52 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2211.mgmt.codfw.wmnet with reboot policy FORCED
  • 03:51 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2210.mgmt.codfw.wmnet with reboot policy FORCED
  • 03:42 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 03:42 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2211 to codfw - jhancock@cumin2002"
  • 03:41 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2211 to codfw - jhancock@cumin2002"
  • 03:39 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 03:37 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2210.mgmt.codfw.wmnet with reboot policy FORCED
  • 03:36 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 03:36 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2210 to codfw - jhancock@cumin2002"
  • 03:35 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2210 to codfw - jhancock@cumin2002"
  • 03:33 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 03:31 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2209.mgmt.codfw.wmnet with reboot policy FORCED
  • 03:30 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 03:30 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2209 to codfw - jhancock@cumin2002"
  • 03:29 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2209 to codfw - jhancock@cumin2002"
  • 03:29 rzl@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 03:28 rzl@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 03:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 03:26 rzl@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 03:26 rzl@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 03:25 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2208.mgmt.codfw.wmnet with reboot policy FORCED
  • 03:21 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2206.mgmt.codfw.wmnet with reboot policy FORCED
  • 03:15 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2207.mgmt.codfw.wmnet with reboot policy FORCED
  • 03:03 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2208.mgmt.codfw.wmnet with reboot policy FORCED
  • 03:01 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2207.mgmt.codfw.wmnet with reboot policy FORCED
  • 03:01 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2208.mgmt.codfw.wmnet with reboot policy FORCED
  • 03:01 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2207.mgmt.codfw.wmnet with reboot policy FORCED
  • 03:00 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2208.mgmt.codfw.wmnet with reboot policy FORCED
  • 03:00 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2207.mgmt.codfw.wmnet with reboot policy FORCED
  • 03:00 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2206.mgmt.codfw.wmnet with reboot policy FORCED
  • 02:59 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 02:59 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2206 to codfw - jhancock@cumin2002"
  • 02:58 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2206 to codfw - jhancock@cumin2002"
  • 02:56 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 02:49 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2204.mgmt.codfw.wmnet with reboot policy FORCED
  • 02:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2205.mgmt.codfw.wmnet with reboot policy FORCED
  • 02:29 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2204.mgmt.codfw.wmnet with reboot policy FORCED
  • 02:28 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2205.mgmt.codfw.wmnet with reboot policy FORCED
  • 02:23 rzl@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 02:22 rzl@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 02:20 rzl@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 02:20 rzl@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 02:11 rzl@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 02:10 rzl@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 00:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 00:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 00:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1240.eqiad.wmnet with reason: Maintenance

2024-02-20

  • 23:59 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1240.eqiad.wmnet with reason: Maintenance
  • 23:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1239.eqiad.wmnet with reason: Maintenance
  • 23:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1239.eqiad.wmnet with reason: Maintenance
  • 23:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1235 (T355609)', diff saved to https://phabricator.wikimedia.org/P57433 and previous config saved to /var/cache/conftool/dbconfig/20240220-233832-marostegui.json
  • 23:25 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:24 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:24 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:24 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P57432 and previous config saved to /var/cache/conftool/dbconfig/20240220-232326-marostegui.json
  • 23:23 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:23 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P57431 and previous config saved to /var/cache/conftool/dbconfig/20240220-230817-marostegui.json
  • 22:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1235 (T355609)', diff saved to https://phabricator.wikimedia.org/P57430 and previous config saved to /var/cache/conftool/dbconfig/20240220-225311-marostegui.json
  • 22:52 sfaci: Deployed refinery using scap, then deployed onto hdfs
  • 22:39 sfaci@deploy2002: Finished deploy [analytics/refinery@d078656] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@d0786561] (duration: 03m 29s)
  • 22:36 sfaci@deploy2002: Started deploy [analytics/refinery@d078656] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@d0786561]
  • 22:36 sfaci@deploy2002: Finished deploy [analytics/refinery@d078656] (thin): Regular analytics weekly train THIN [analytics/refinery@d0786561] (duration: 00m 05s)
  • 22:35 sfaci@deploy2002: Started deploy [analytics/refinery@d078656] (thin): Regular analytics weekly train THIN [analytics/refinery@d0786561]
  • 22:35 sfaci@deploy2002: Finished deploy [analytics/refinery@d078656]: Regular analytics weekly train [analytics/refinery@d0786561] (duration: 00m 21s)
  • 22:35 sfaci@deploy2002: Started deploy [analytics/refinery@d078656]: Regular analytics weekly train [analytics/refinery@d0786561]
  • 22:34 sfaci@deploy2002: Finished deploy [analytics/refinery@d078656]: Regular analytics weekly train [analytics/refinery@d0786561] (duration: 13m 19s)
  • 22:24 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1235 (T355609)', diff saved to https://phabricator.wikimedia.org/P57429 and previous config saved to /var/cache/conftool/dbconfig/20240220-222445-marostegui.json
  • 22:24 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1235.eqiad.wmnet with reason: Maintenance
  • 22:24 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1235.eqiad.wmnet with reason: Maintenance
  • 22:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234 (T355609)', diff saved to https://phabricator.wikimedia.org/P57428 and previous config saved to /var/cache/conftool/dbconfig/20240220-222423-marostegui.json
  • 22:20 sfaci@deploy2002: Started deploy [analytics/refinery@d078656]: Regular analytics weekly train [analytics/refinery@d0786561]
  • 22:18 sfaci: Starting refinery deployment
  • 22:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P57427 and previous config saved to /var/cache/conftool/dbconfig/20240220-220917-marostegui.json
  • 22:00 cjming: end of UTC late backport window
  • 21:58 cjming@deploy2002: Finished scap: Backport for Fix for regression in audio track suppression logic (T357942), Fix for regression in audio track suppression logic (T357942) (duration: 09m 24s)
  • 21:56 cdanis@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 21:56 cdanis@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 21:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P57426 and previous config saved to /var/cache/conftool/dbconfig/20240220-215410-marostegui.json
  • 21:51 cjming@deploy2002: brion and cjming: Continuing with sync
  • 21:50 cjming@deploy2002: brion and cjming: Backport for Fix for regression in audio track suppression logic (T357942), Fix for regression in audio track suppression logic (T357942) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:49 cjming@deploy2002: Started scap: Backport for Fix for regression in audio track suppression logic (T357942), Fix for regression in audio track suppression logic (T357942)
  • 21:48 cjming@deploy2002: Finished scap: Backport for Enable night mode on mobile test servers (T357759) (duration: 11m 01s)
  • 21:48 cdanis@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 21:48 cdanis@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 21:47 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:47 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:47 cdanis@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 21:47 cdanis@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 21:42 cdanis@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 21:42 cdanis@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 21:40 cjming@deploy2002: cjming and jdlrobson: Continuing with sync
  • 21:39 cjming@deploy2002: cjming and jdlrobson: Backport for Enable night mode on mobile test servers (T357759) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234 (T355609)', diff saved to https://phabricator.wikimedia.org/P57424 and previous config saved to /var/cache/conftool/dbconfig/20240220-213904-marostegui.json
  • 21:37 cjming@deploy2002: Started scap: Backport for Enable night mode on mobile test servers (T357759)
  • 21:35 cjming@deploy2002: Finished scap: Backport for Enable desktop diff for anonymous users on enwiki (T350181) (duration: 13m 19s)
  • 21:30 ryankemper@cumin2002: END (FAIL) - Cookbook sre.wdqs.restart (exit_code=99)
  • 21:28 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:28 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:27 cjming@deploy2002: jdlrobson and cjming: Continuing with sync
  • 21:24 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:24 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:23 cjming@deploy2002: jdlrobson and cjming: Backport for Enable desktop diff for anonymous users on enwiki (T350181) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:22 cjming@deploy2002: Started scap: Backport for Enable desktop diff for anonymous users on enwiki (T350181)
  • 21:20 cjming@deploy2002: Finished scap: Backport for Correctly turn on Parsoid read views by default on wikitech Talk pages (duration: 12m 53s)
  • 21:11 cjming@deploy2002: cscott and cjming: Continuing with sync
  • 21:08 cjming@deploy2002: cscott and cjming: Backport for Correctly turn on Parsoid read views by default on wikitech Talk pages synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:08 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1234 (T355609)', diff saved to https://phabricator.wikimedia.org/P57423 and previous config saved to /var/cache/conftool/dbconfig/20240220-210840-marostegui.json
  • 21:08 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1234.eqiad.wmnet with reason: Maintenance
  • 21:08 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1234.eqiad.wmnet with reason: Maintenance
  • 21:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232 (T355609)', diff saved to https://phabricator.wikimedia.org/P57422 and previous config saved to /var/cache/conftool/dbconfig/20240220-210819-marostegui.json
  • 21:07 cjming@deploy2002: Started scap: Backport for Correctly turn on Parsoid read views by default on wikitech Talk pages
  • 21:04 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:04 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:56 cdanis@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 20:56 cdanis@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 20:56 cdanis@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 20:55 cdanis@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 20:55 cdanis@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 20:55 cdanis@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 20:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P57421 and previous config saved to /var/cache/conftool/dbconfig/20240220-205312-marostegui.json
  • 20:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P57420 and previous config saved to /var/cache/conftool/dbconfig/20240220-203806-marostegui.json
  • 20:35 cdanis@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 20:35 cdanis@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 20:32 ryankemper@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts cloudelastic[1001-1004].wikimedia.org
  • 20:32 ryankemper@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 20:32 ryankemper@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudelastic[1001-1004].wikimedia.org decommissioned, removing all IPs except the asset tag one - ryankemper@cumin2002"
  • 20:31 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:31 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:30 ryankemper@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudelastic[1001-1004].wikimedia.org decommissioned, removing all IPs except the asset tag one - ryankemper@cumin2002"
  • 20:27 ryankemper@cumin2002: START - Cookbook sre.dns.netbox
  • 20:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232 (T355609)', diff saved to https://phabricator.wikimedia.org/P57419 and previous config saved to /var/cache/conftool/dbconfig/20240220-202300-marostegui.json
  • 20:01 ryankemper@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudelastic[1001-1004].wikimedia.org
  • 19:53 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1232 (T355609)', diff saved to https://phabricator.wikimedia.org/P57417 and previous config saved to /var/cache/conftool/dbconfig/20240220-195303-marostegui.json
  • 19:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1232.eqiad.wmnet with reason: Maintenance
  • 19:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1232.eqiad.wmnet with reason: Maintenance
  • 19:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1228 (T355609)', diff saved to https://phabricator.wikimedia.org/P57416 and previous config saved to /var/cache/conftool/dbconfig/20240220-195242-marostegui.json
  • 19:48 ryankemper@cumin2002: END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97) (T347624, testing 961878 patch) xfer categories from wdqs2024.codfw.wmnet -> wdqs2025.codfw.wmnet w/ force delete existing files, repooling source-only afterwards
  • 19:48 ryankemper@cumin2002: START - Cookbook sre.wdqs.data-transfer (T347624, testing 961878 patch) xfer categories from wdqs2024.codfw.wmnet -> wdqs2025.codfw.wmnet w/ force delete existing files, repooling source-only afterwards
  • 19:43 ryankemper@cumin2002: START - Cookbook sre.wdqs.restart
  • 19:38 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192 (T357189)', diff saved to https://phabricator.wikimedia.org/P57415 and previous config saved to /var/cache/conftool/dbconfig/20240220-193842-arnaudb.json
  • 19:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1228', diff saved to https://phabricator.wikimedia.org/P57414 and previous config saved to /var/cache/conftool/dbconfig/20240220-193735-marostegui.json
  • 19:36 cdanis@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 19:35 cdanis@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 19:23 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P57413 and previous config saved to /var/cache/conftool/dbconfig/20240220-192335-arnaudb.json
  • 19:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1228', diff saved to https://phabricator.wikimedia.org/P57412 and previous config saved to /var/cache/conftool/dbconfig/20240220-192229-marostegui.json
  • 19:12 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: group0 wikis to 1.42.0-wmf.19 refs T354437
  • 19:08 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P57411 and previous config saved to /var/cache/conftool/dbconfig/20240220-190829-arnaudb.json
  • 19:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1228 (T355609)', diff saved to https://phabricator.wikimedia.org/P57410 and previous config saved to /var/cache/conftool/dbconfig/20240220-190722-marostegui.json
  • 18:53 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192 (T357189)', diff saved to https://phabricator.wikimedia.org/P57409 and previous config saved to /var/cache/conftool/dbconfig/20240220-185322-arnaudb.json
  • 18:49 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2192 (T357189)', diff saved to https://phabricator.wikimedia.org/P57408 and previous config saved to /var/cache/conftool/dbconfig/20240220-184925-arnaudb.json
  • 18:49 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2192.codfw.wmnet with reason: Maintenance
  • 18:49 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2192.codfw.wmnet with reason: Maintenance
  • 18:49 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T357189)', diff saved to https://phabricator.wikimedia.org/P57407 and previous config saved to /var/cache/conftool/dbconfig/20240220-184903-arnaudb.json
  • 18:42 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1228 (T355609)', diff saved to https://phabricator.wikimedia.org/P57406 and previous config saved to /var/cache/conftool/dbconfig/20240220-184157-marostegui.json
  • 18:41 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1228.eqiad.wmnet with reason: Maintenance
  • 18:41 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1228.eqiad.wmnet with reason: Maintenance
  • 18:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219 (T355609)', diff saved to https://phabricator.wikimedia.org/P57405 and previous config saved to /var/cache/conftool/dbconfig/20240220-184124-marostegui.json
  • 18:33 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P57404 and previous config saved to /var/cache/conftool/dbconfig/20240220-183356-arnaudb.json
  • 18:31 sukhe@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp4052.ulsfo.wmnet,service=(cdn|ats-be)
  • 18:31 sukhe: pool cp4052: bookworm cp host with haproxy 2.6 built against OpenSSL 1.1.1: T352744
  • 18:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P57403 and previous config saved to /var/cache/conftool/dbconfig/20240220-182617-marostegui.json
  • 18:22 sukhe: reprepro -C component/haproxy26 include bookworm-wikimedia haproxy_2.6.16-1~bpo12+1_amd64.changes: T352744
  • 18:18 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P57402 and previous config saved to /var/cache/conftool/dbconfig/20240220-181850-arnaudb.json
  • 18:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P57401 and previous config saved to /var/cache/conftool/dbconfig/20240220-181111-marostegui.json
  • 18:03 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T357189)', diff saved to https://phabricator.wikimedia.org/P57400 and previous config saved to /var/cache/conftool/dbconfig/20240220-180342-arnaudb.json
  • 17:59 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2178 (T357189)', diff saved to https://phabricator.wikimedia.org/P57399 and previous config saved to /var/cache/conftool/dbconfig/20240220-175938-arnaudb.json
  • 17:59 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2178.codfw.wmnet with reason: Maintenance
  • 17:59 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2178.codfw.wmnet with reason: Maintenance
  • 17:59 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171 (T357189)', diff saved to https://phabricator.wikimedia.org/P57398 and previous config saved to /var/cache/conftool/dbconfig/20240220-175917-arnaudb.json
  • 17:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219 (T355609)', diff saved to https://phabricator.wikimedia.org/P57397 and previous config saved to /var/cache/conftool/dbconfig/20240220-175605-marostegui.json
  • 17:44 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P57396 and previous config saved to /var/cache/conftool/dbconfig/20240220-174411-arnaudb.json
  • 17:29 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P57395 and previous config saved to /var/cache/conftool/dbconfig/20240220-172904-arnaudb.json
  • 17:27 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1219 (T355609)', diff saved to https://phabricator.wikimedia.org/P57394 and previous config saved to /var/cache/conftool/dbconfig/20240220-172716-marostegui.json
  • 17:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1219.eqiad.wmnet with reason: Maintenance
  • 17:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1219.eqiad.wmnet with reason: Maintenance
  • 17:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T355609)', diff saved to https://phabricator.wikimedia.org/P57393 and previous config saved to /var/cache/conftool/dbconfig/20240220-172653-marostegui.json
  • 17:18 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4052.ulsfo.wmnet with OS bookworm
  • 17:14 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171 (T357189)', diff saved to https://phabricator.wikimedia.org/P57392 and previous config saved to /var/cache/conftool/dbconfig/20240220-171358-arnaudb.json
  • 17:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P57391 and previous config saved to /var/cache/conftool/dbconfig/20240220-171147-marostegui.json
  • 17:10 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2171 (T357189)', diff saved to https://phabricator.wikimedia.org/P57390 and previous config saved to /var/cache/conftool/dbconfig/20240220-170949-arnaudb.json
  • 17:09 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2171.codfw.wmnet with reason: Maintenance
  • 17:09 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2171.codfw.wmnet with reason: Maintenance
  • 17:09 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T357189)', diff saved to https://phabricator.wikimedia.org/P57389 and previous config saved to /var/cache/conftool/dbconfig/20240220-170928-arnaudb.json
  • 16:57 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4052.ulsfo.wmnet with reason: host reimage
  • 16:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P57388 and previous config saved to /var/cache/conftool/dbconfig/20240220-165641-marostegui.json
  • 16:55 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4052.ulsfo.wmnet with reason: host reimage
  • 16:54 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P57387 and previous config saved to /var/cache/conftool/dbconfig/20240220-165421-arnaudb.json
  • 16:43 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest2005.codfw.wmnet with OS bookworm
  • 16:43 brett@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=codfw,name=cp20(29|30).codfw.wmnet
  • 16:42 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cp[2029-2030].codfw.wmnet
  • 16:42 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for cp[2029-2030].codfw.wmnet
  • 16:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T355609)', diff saved to https://phabricator.wikimedia.org/P57386 and previous config saved to /var/cache/conftool/dbconfig/20240220-164134-marostegui.json
  • 16:39 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P57385 and previous config saved to /var/cache/conftool/dbconfig/20240220-163915-arnaudb.json
  • 16:35 reedy@deploy2002: Synchronized php-1.42.0-wmf.19/extensions/AntiSpoof/: T357995 (duration: 11m 02s)
  • 16:35 arnaudb@cumin1002: dbctl commit (dc=all): 'db1233 (re)pooling @ 100%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57384 and previous config saved to /var/cache/conftool/dbconfig/20240220-163451-arnaudb.json
  • 16:35 arnaudb@cumin1002: dbctl commit (dc=all): 'db1210 (re)pooling @ 100%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57383 and previous config saved to /var/cache/conftool/dbconfig/20240220-163447-arnaudb.json
  • 16:34 arnaudb@cumin1002: dbctl commit (dc=all): 'db1168 (re)pooling @ 100%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57382 and previous config saved to /var/cache/conftool/dbconfig/20240220-163447-arnaudb.json
  • 16:34 arnaudb@cumin1002: dbctl commit (dc=all): 'db1226 (re)pooling @ 100%: maintenance done', diff saved to https://phabricator.wikimedia.org/P57381 and previous config saved to /var/cache/conftool/dbconfig/20240220-163442-arnaudb.json
  • 16:30 fnegri@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit1001.eqiad.wmnet
  • 16:29 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp4052.ulsfo.wmnet with OS bookworm
  • 16:27 sukhe@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4052.ulsfo.wmnet with OS bookworm
  • 16:24 fnegri@cumin1002: START - Cookbook sre.hosts.reboot-single for host cloudrabbit1001.eqiad.wmnet
  • 16:24 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T357189)', diff saved to https://phabricator.wikimedia.org/P57380 and previous config saved to /var/cache/conftool/dbconfig/20240220-162408-arnaudb.json
  • 16:21 fnegri@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit1002.eqiad.wmnet
  • 16:20 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2157 (T357189)', diff saved to https://phabricator.wikimedia.org/P57379 and previous config saved to /var/cache/conftool/dbconfig/20240220-161953-arnaudb.json
  • 16:20 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2157.codfw.wmnet with reason: Maintenance
  • 16:20 arnaudb@cumin1002: dbctl commit (dc=all): 'db1233 (re)pooling @ 75%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57378 and previous config saved to /var/cache/conftool/dbconfig/20240220-161946-arnaudb.json
  • 16:20 arnaudb@cumin1002: dbctl commit (dc=all): 'db1210 (re)pooling @ 75%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57377 and previous config saved to /var/cache/conftool/dbconfig/20240220-161942-arnaudb.json
  • 16:19 arnaudb@cumin1002: dbctl commit (dc=all): 'db1168 (re)pooling @ 75%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57376 and previous config saved to /var/cache/conftool/dbconfig/20240220-161942-arnaudb.json
  • 16:19 arnaudb@cumin1002: dbctl commit (dc=all): 'db1226 (re)pooling @ 75%: maintenance done', diff saved to https://phabricator.wikimedia.org/P57375 and previous config saved to /var/cache/conftool/dbconfig/20240220-161937-arnaudb.json
  • 16:19 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2157.codfw.wmnet with reason: Maintenance
  • 16:19 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2128 (T357189)', diff saved to https://phabricator.wikimedia.org/P57374 and previous config saved to /var/cache/conftool/dbconfig/20240220-161931-arnaudb.json
  • 16:18 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp4052.ulsfo.wmnet with OS bookworm
  • 16:14 fnegri@cumin1002: START - Cookbook sre.hosts.reboot-single for host cloudrabbit1002.eqiad.wmnet
  • 16:13 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1218 (T355609)', diff saved to https://phabricator.wikimedia.org/P57373 and previous config saved to /var/cache/conftool/dbconfig/20240220-161348-marostegui.json
  • 16:13 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1218.eqiad.wmnet with reason: Maintenance
  • 16:13 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1218.eqiad.wmnet with reason: Maintenance
  • 16:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T355609)', diff saved to https://phabricator.wikimedia.org/P57372 and previous config saved to /var/cache/conftool/dbconfig/20240220-161326-marostegui.json
  • 16:12 fnegri@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit1003.eqiad.wmnet
  • 16:11 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Unbanning all hosts in search_codfw
  • 16:11 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Unbanning all hosts in search_codfw
  • 16:09 hnowlan@cumin2002: conftool action : set/weight=10:pooled=yes; selector: name=(mw2312.codfw.wmnet|mw2313.codfw.wmnet|mw2367.codfw.wmnet|mw2369.codfw.wmnet)
  • 16:07 topranks: Commencing network maintenance migrating servers to new switch codfw rack A7 T355867
  • 16:06 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 22 hosts with reason: Migrating servers in codfw rack A7 to lsw1-a7-codfw
  • 16:06 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on 22 hosts with reason: Migrating servers in codfw rack A7 to lsw1-a7-codfw
  • 16:05 fnegri@cumin1002: START - Cookbook sre.hosts.reboot-single for host cloudrabbit1003.eqiad.wmnet
  • 16:05 arnaudb@cumin1002: dbctl commit (dc=all): 'db1210 (re)pooling @ 50%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57371 and previous config saved to /var/cache/conftool/dbconfig/20240220-160438-arnaudb.json
  • 16:05 arnaudb@cumin1002: dbctl commit (dc=all): 'db1168 (re)pooling @ 50%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57370 and previous config saved to /var/cache/conftool/dbconfig/20240220-160437-arnaudb.json
  • 16:04 arnaudb@cumin1002: dbctl commit (dc=all): 'db1226 (re)pooling @ 50%: maintenance done', diff saved to https://phabricator.wikimedia.org/P57369 and previous config saved to /var/cache/conftool/dbconfig/20240220-160432-arnaudb.json
  • 16:04 arnaudb@cumin1002: dbctl commit (dc=all): 'db1233 (re)pooling @ 50%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57368 and previous config saved to /var/cache/conftool/dbconfig/20240220-160429-arnaudb.json
  • 16:04 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2128', diff saved to https://phabricator.wikimedia.org/P57367 and previous config saved to /var/cache/conftool/dbconfig/20240220-160423-arnaudb.json
  • 16:02 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on asw-a-codfw,cr[1-2]-codfw,lsw1-a7-codfw.mgmt with reason: prepping for server uplink migration codfw rack a7
  • 16:02 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on asw-a-codfw,cr[1-2]-codfw,lsw1-a7-codfw.mgmt with reason: prepping for server uplink migration codfw rack a7
  • 16:02 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2005.codfw.wmnet with reason: host reimage
  • 16:00 hnowlan: running `homer 'cr*codfw*' commit 'T351074'` for new k8s workers
  • 16:00 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Banning hosts: elastic2089*,elastic2062*,elastic2061* for switch maintenance - bking@cumin2002 - T355860
  • 16:00 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Banning hosts: elastic2089*,elastic2062*,elastic2061* for switch maintenance - bking@cumin2002 - T355860
  • 15:59 ayounsi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2005.codfw.wmnet with reason: host reimage
  • 15:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P57366 and previous config saved to /var/cache/conftool/dbconfig/20240220-155820-marostegui.json
  • 15:55 xcollazo@deploy2002: Finished deploy [airflow-dags/analytics@b115452]: (no justification provided) (duration: 00m 34s)
  • 15:55 Emperor: import ceph-reef packages to apt1001 T279621
  • 15:55 xcollazo@deploy2002: Started deploy [airflow-dags/analytics@b115452]: (no justification provided)
  • 15:54 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 15:53 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 15:53 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 15:50 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 15:50 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 15:49 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 15:49 arnaudb@cumin1002: dbctl commit (dc=all): 'db1233 (re)pooling @ 20%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57365 and previous config saved to /var/cache/conftool/dbconfig/20240220-154924-arnaudb.json
  • 15:49 arnaudb@cumin1002: dbctl commit (dc=all): 'db1210 (re)pooling @ 20%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57364 and previous config saved to /var/cache/conftool/dbconfig/20240220-154920-arnaudb.json
  • 15:49 arnaudb@cumin1002: dbctl commit (dc=all): 'db1168 (re)pooling @ 20%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57363 and previous config saved to /var/cache/conftool/dbconfig/20240220-154920-arnaudb.json
  • 15:49 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2128', diff saved to https://phabricator.wikimedia.org/P57362 and previous config saved to /var/cache/conftool/dbconfig/20240220-154917-arnaudb.json
  • 15:46 denisse: While doing the alert hosts upgrade we encountered some issues that prevented us from properly reimaging the hosts and proceeding with the upgrade. We're investigating the issue and will announce the new alert hosts upgrade date ASAP. - T333615
  • 15:46 godog: re-enable meta-monitoring on wikitech-static.w.o - T333615
  • 15:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P57361 and previous config saved to /var/cache/conftool/dbconfig/20240220-154313-marostegui.json
  • 15:42 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db1233.eqiad.wmnet
  • 15:41 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db1168.eqiad.wmnet
  • 15:41 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db1226.eqiad.wmnet
  • 15:41 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db1210.eqiad.wmnet
  • 15:37 arnaudb@cumin1002: START - Cookbook sre.mysql.upgrade for db1233.eqiad.wmnet
  • 15:37 arnaudb@cumin1002: START - Cookbook sre.mysql.upgrade for db1226.eqiad.wmnet
  • 15:37 arnaudb@cumin1002: START - Cookbook sre.mysql.upgrade for db1210.eqiad.wmnet
  • 15:36 arnaudb@cumin1002: START - Cookbook sre.mysql.upgrade for db1168.eqiad.wmnet
  • 15:35 arnaudb@cumin1002: dbctl commit (dc=all): 'db1168 db1210 db1226 db1233 depool for T356240', diff saved to https://phabricator.wikimedia.org/P57359 and previous config saved to /var/cache/conftool/dbconfig/20240220-153557-arnaudb.json
  • 15:34 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2128 (T357189)', diff saved to https://phabricator.wikimedia.org/P57358 and previous config saved to /var/cache/conftool/dbconfig/20240220-153410-arnaudb.json
  • 15:33 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on db[1168,1210,1226,1233].eqiad.wmnet with reason: Silence for reboot T356240
  • 15:33 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on db[1168,1210,1226,1233].eqiad.wmnet with reason: Silence for reboot T356240
  • 15:32 godog: temp disable meta-monitoring on wikitech-static.w.o - T333615
  • 15:30 Emperor: import ceph-reef packages to apt1001 T279621
  • 15:30 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2128 (T357189)', diff saved to https://phabricator.wikimedia.org/P57357 and previous config saved to /var/cache/conftool/dbconfig/20240220-153000-arnaudb.json
  • 15:29 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 15:29 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 15:29 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2128.codfw.wmnet with reason: Maintenance
  • 15:29 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2128.codfw.wmnet with reason: Maintenance
  • 15:29 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2123 (T357189)', diff saved to https://phabricator.wikimedia.org/P57356 and previous config saved to /var/cache/conftool/dbconfig/20240220-152933-arnaudb.json
  • 15:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T355609)', diff saved to https://phabricator.wikimedia.org/P57355 and previous config saved to /var/cache/conftool/dbconfig/20240220-152807-marostegui.json
  • 15:25 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host sretest2005.codfw.wmnet with OS bookworm
  • 15:18 marostegui@cumin1002: dbctl commit (dc=all): 'db2170 (re)pooling @ 100%: After migration', diff saved to https://phabricator.wikimedia.org/P57354 and previous config saved to /var/cache/conftool/dbconfig/20240220-151812-root.json
  • 15:16 dcausse: depooled wdqs2009 & wdqs2020 (T355867)
  • 15:16 denisse_: starting the Alert hosts upgrade to Bookworm - T333615
  • 15:14 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2123', diff saved to https://phabricator.wikimedia.org/P57353 and previous config saved to /var/cache/conftool/dbconfig/20240220-151426-arnaudb.json
  • 15:13 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2005.codfw.wmnet with OS bookworm
  • 15:13 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db[2146,2151].codfw.wmnet
  • 14:55 bking@deploy2002: helmfile [codfw] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 14:55 bking@deploy2002: helmfile [codfw] START helmfile.d/services/rdf-streaming-updater: apply
  • 14:51 marostegui@cumin1002: dbctl commit (dc=all): 'db2194 (re)pooling @ 75%: After rearranging sections T354826', diff saved to https://phabricator.wikimedia.org/P57346 and previous config saved to /var/cache/conftool/dbconfig/20240220-145124-root.json
  • 14:50 bking@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 14:49 sukhe: disable puppet on A:cp to merge CR 1004126
  • 14:49 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on cp[2029-2030].codfw.wmnet with reason: T355867
  • 14:49 bking@deploy2002: helmfile [eqiad] START helmfile.d/services/rdf-streaming-updater: apply
  • 14:49 brett@cumin2002: START - Cookbook sre.hosts.downtime for 3:00:00 on cp[2029-2030].codfw.wmnet with reason: T355867
  • 14:48 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db1231.eqiad.wmnet
  • 14:48 bking@deploy2002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 14:48 bking@deploy2002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 14:48 marostegui@cumin1002: dbctl commit (dc=all): 'db2170 (re)pooling @ 50%: After migration', diff saved to https://phabricator.wikimedia.org/P57345 and previous config saved to /var/cache/conftool/dbconfig/20240220-144803-root.json
  • 14:48 bking@deploy2002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 14:48 brett@puppetmaster1001: conftool action : set/pooled=no; selector: dc=codfw,name=cp20(29|30).codfw.wmnet
  • 14:48 bking@deploy2002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 14:47 marostegui@cumin1002: dbctl commit (dc=all): 'db2190 (re)pooling @ 75%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57344 and previous config saved to /var/cache/conftool/dbconfig/20240220-144753-root.json
  • 14:46 sukhe: updating pdns-recursor to 4.8.6-1 on dns*
  • 14:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P57343 and previous config saved to /var/cache/conftool/dbconfig/20240220-144539-marostegui.json
  • 14:44 arnaudb@cumin1002: START - Cookbook sre.mysql.upgrade for db1231.eqiad.wmnet
  • 14:44 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2123 (T357189)', diff saved to https://phabricator.wikimedia.org/P57342 and previous config saved to /var/cache/conftool/dbconfig/20240220-144414-arnaudb.json
  • 14:40 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2123 (T357189)', diff saved to https://phabricator.wikimedia.org/P57341 and previous config saved to /var/cache/conftool/dbconfig/20240220-144001-arnaudb.json
  • 14:40 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2123.codfw.wmnet with reason: Maintenance
  • 14:39 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2123.codfw.wmnet with reason: Maintenance
  • 14:39 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2111 (T357189)', diff saved to https://phabricator.wikimedia.org/P57340 and previous config saved to /var/cache/conftool/dbconfig/20240220-143939-arnaudb.json
  • 14:36 marostegui@cumin1002: dbctl commit (dc=all): 'db2194 (re)pooling @ 50%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57339 and previous config saved to /var/cache/conftool/dbconfig/20240220-143619-root.json
  • 14:32 marostegui@cumin1002: dbctl commit (dc=all): 'db2170 (re)pooling @ 25%: After migration', diff saved to https://phabricator.wikimedia.org/P57338 and previous config saved to /var/cache/conftool/dbconfig/20240220-143258-root.json
  • 14:32 marostegui@cumin1002: dbctl commit (dc=all): 'db2190 (re)pooling @ 50%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57337 and previous config saved to /var/cache/conftool/dbconfig/20240220-143249-root.json
  • 14:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P57336 and previous config saved to /var/cache/conftool/dbconfig/20240220-143032-marostegui.json
  • 14:24 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2111', diff saved to https://phabricator.wikimedia.org/P57334 and previous config saved to /var/cache/conftool/dbconfig/20240220-142433-arnaudb.json
  • 14:21 claime: launching build-production-images - T342346
  • 14:21 marostegui@cumin1002: dbctl commit (dc=all): 'db2194 (re)pooling @ 25%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57333 and previous config saved to /var/cache/conftool/dbconfig/20240220-142114-root.json
  • 14:20 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host sretest2005.codfw.wmnet with OS bookworm
  • 14:19 ayounsi@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2005.codfw.wmnet with OS bookworm
  • 14:18 marostegui@cumin1002: dbctl commit (dc=all): 'db2170 (re)pooling @ 10%: After migration', diff saved to https://phabricator.wikimedia.org/P57332 and previous config saved to /var/cache/conftool/dbconfig/20240220-141752-root.json
  • 14:17 marostegui@cumin1002: dbctl commit (dc=all): 'db2190 (re)pooling @ 25%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57331 and previous config saved to /var/cache/conftool/dbconfig/20240220-141744-root.json
  • 14:15 claime: Uncordoning mw2379
  • 14:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206 (T355609)', diff saved to https://phabricator.wikimedia.org/P57330 and previous config saved to /var/cache/conftool/dbconfig/20240220-141525-marostegui.json
  • 14:09 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2111', diff saved to https://phabricator.wikimedia.org/P57329 and previous config saved to /var/cache/conftool/dbconfig/20240220-140926-arnaudb.json
  • 14:06 marostegui@cumin1002: dbctl commit (dc=all): 'db2194 (re)pooling @ 10%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57328 and previous config saved to /var/cache/conftool/dbconfig/20240220-140609-root.json
  • 14:05 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host sretest2005.codfw.wmnet with OS bookworm
  • 14:02 marostegui@cumin1002: dbctl commit (dc=all): 'db2170 (re)pooling @ 5%: After migration', diff saved to https://phabricator.wikimedia.org/P57327 and previous config saved to /var/cache/conftool/dbconfig/20240220-140247-root.json
  • 14:02 marostegui@cumin1002: dbctl commit (dc=all): 'db2190 (re)pooling @ 10%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57326 and previous config saved to /var/cache/conftool/dbconfig/20240220-140239-root.json
  • 13:55 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2005.codfw.wmnet with reason: sretest
  • 13:55 ayounsi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2005.codfw.wmnet with reason: sretest
  • 13:54 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2111 (T357189)', diff saved to https://phabricator.wikimedia.org/P57325 and previous config saved to /var/cache/conftool/dbconfig/20240220-135420-arnaudb.json
  • 13:54 marostegui@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1020.eqiad.wmnet,service=s5
  • 13:54 marostegui@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1020.eqiad.wmnet,service=s8
  • 13:51 marostegui@cumin1002: dbctl commit (dc=all): 'db2194 (re)pooling @ 5%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57324 and previous config saved to /var/cache/conftool/dbconfig/20240220-135104-root.json
  • 13:49 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2111 (T357189)', diff saved to https://phabricator.wikimedia.org/P57323 and previous config saved to /var/cache/conftool/dbconfig/20240220-134958-arnaudb.json
  • 13:49 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2111.codfw.wmnet with reason: Maintenance
  • 13:49 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2111.codfw.wmnet with reason: Maintenance
  • 13:47 marostegui@cumin1002: dbctl commit (dc=all): 'db2170 (re)pooling @ 1%: After migration', diff saved to https://phabricator.wikimedia.org/P57322 and previous config saved to /var/cache/conftool/dbconfig/20240220-134742-root.json
  • 13:47 marostegui@cumin1002: dbctl commit (dc=all): 'db2190 (re)pooling @ 5%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57321 and previous config saved to /var/cache/conftool/dbconfig/20240220-134734-root.json
  • 13:47 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2101.codfw.wmnet with reason: Maintenance
  • 13:47 jynus: setting up mariadb instances at db2097
  • 13:47 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2101.codfw.wmnet with reason: Maintenance
  • 13:45 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 13:45 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 13:44 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1245.eqiad.wmnet with reason: Maintenance
  • 13:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1206 (T355609)', diff saved to https://phabricator.wikimedia.org/P57320 and previous config saved to /var/cache/conftool/dbconfig/20240220-134403-marostegui.json
  • 13:44 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1245.eqiad.wmnet with reason: Maintenance
  • 13:43 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1230 (T357189)', diff saved to https://phabricator.wikimedia.org/P57319 and previous config saved to /var/cache/conftool/dbconfig/20240220-134354-arnaudb.json
  • 13:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1206.eqiad.wmnet with reason: Maintenance
  • 13:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1206.eqiad.wmnet with reason: Maintenance
  • 13:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T355609)', diff saved to https://phabricator.wikimedia.org/P57318 and previous config saved to /var/cache/conftool/dbconfig/20240220-134334-marostegui.json
  • 13:28 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1230', diff saved to https://phabricator.wikimedia.org/P57317 and previous config saved to /var/cache/conftool/dbconfig/20240220-132848-arnaudb.json
  • 13:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P57316 and previous config saved to /var/cache/conftool/dbconfig/20240220-132827-marostegui.json
  • 13:13 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1230', diff saved to https://phabricator.wikimedia.org/P57315 and previous config saved to /var/cache/conftool/dbconfig/20240220-131341-arnaudb.json
  • 13:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P57314 and previous config saved to /var/cache/conftool/dbconfig/20240220-131320-marostegui.json
  • 13:08 marostegui@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db2190.codfw.wmnet onto db2194.codfw.wmnet
  • 12:58 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1230 (T357189)', diff saved to https://phabricator.wikimedia.org/P57313 and previous config saved to /var/cache/conftool/dbconfig/20240220-125835-arnaudb.json
  • 12:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T355609)', diff saved to https://phabricator.wikimedia.org/P57312 and previous config saved to /var/cache/conftool/dbconfig/20240220-125814-marostegui.json
  • 12:55 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1230 (T357189)', diff saved to https://phabricator.wikimedia.org/P57311 and previous config saved to /var/cache/conftool/dbconfig/20240220-125516-arnaudb.json
  • 12:55 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1230.eqiad.wmnet with reason: Maintenance
  • 12:54 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1230.eqiad.wmnet with reason: Maintenance
  • 12:53 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1216.eqiad.wmnet with reason: Maintenance
  • 12:53 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1216.eqiad.wmnet with reason: Maintenance
  • 12:53 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213 (T357189)', diff saved to https://phabricator.wikimedia.org/P57310 and previous config saved to /var/cache/conftool/dbconfig/20240220-125311-arnaudb.json
  • 12:48 marostegui@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1020.eqiad.wmnet,service=s8
  • 12:48 marostegui@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1020.eqiad.wmnet,service=s5
  • 12:38 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213', diff saved to https://phabricator.wikimedia.org/P57309 and previous config saved to /var/cache/conftool/dbconfig/20240220-123804-arnaudb.json
  • 12:29 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1196 (T355609)', diff saved to https://phabricator.wikimedia.org/P57308 and previous config saved to /var/cache/conftool/dbconfig/20240220-122947-marostegui.json
  • 12:29 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 12:29 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 12:29 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1196.eqiad.wmnet with reason: Maintenance
  • 12:29 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1196.eqiad.wmnet with reason: Maintenance
  • 12:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T355609)', diff saved to https://phabricator.wikimedia.org/P57307 and previous config saved to /var/cache/conftool/dbconfig/20240220-122907-marostegui.json
  • 12:22 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213', diff saved to https://phabricator.wikimedia.org/P57306 and previous config saved to /var/cache/conftool/dbconfig/20240220-122258-arnaudb.json
  • 12:18 hnowlan@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw2384.codfw.wmnet with OS bullseye
  • 12:18 hnowlan@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw2385.codfw.wmnet with OS bullseye
  • 12:16 claime: Draining mw2379
  • 12:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P57305 and previous config saved to /var/cache/conftool/dbconfig/20240220-121402-marostegui.json
  • 12:07 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213 (T357189)', diff saved to https://phabricator.wikimedia.org/P57304 and previous config saved to /var/cache/conftool/dbconfig/20240220-120752-arnaudb.json
  • 12:05 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1213 (T357189)', diff saved to https://phabricator.wikimedia.org/P57303 and previous config saved to /var/cache/conftool/dbconfig/20240220-120434-arnaudb.json
  • 12:05 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1213.eqiad.wmnet with reason: Maintenance
  • 12:04 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1213.eqiad.wmnet with reason: Maintenance
  • 12:04 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1210 (T357189)', diff saved to https://phabricator.wikimedia.org/P57302 and previous config saved to /var/cache/conftool/dbconfig/20240220-120412-arnaudb.json
  • 12:04 kart_: cxserver: Update to 2024-02-15-085232-production + Bump mesh.configuration to 1.7 (T333969, T352747, T355686, T255568)
  • 12:03 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw2385.codfw.wmnet with OS bullseye
  • 12:03 hnowlan@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw2385.codfw.wmnet with OS bullseye
  • 12:02 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw2384.codfw.wmnet with OS bullseye
  • 12:02 hnowlan@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw2384.codfw.wmnet with OS bullseye
  • 12:01 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2369.codfw.wmnet with OS bullseye
  • 12:00 marostegui@cumin1002: dbctl commit (dc=all): 'db2169 (re)pooling @ 100%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57301 and previous config saved to /var/cache/conftool/dbconfig/20240220-120031-root.json
  • 12:00 kartik@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
  • 11:59 kartik@deploy2002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
  • 11:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P57300 and previous config saved to /var/cache/conftool/dbconfig/20240220-115855-marostegui.json
  • 11:57 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2367.codfw.wmnet with OS bullseye
  • 11:55 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2313.codfw.wmnet with OS bullseye
  • 11:55 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
  • 11:54 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/cxserver: apply
  • 11:51 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2312.codfw.wmnet with OS bullseye
  • 11:51 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
  • 11:50 sukhe: updating pdns-recursor to 4.8.6-1 on doh* hosts
  • 11:50 kartik@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
  • 11:49 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P57299 and previous config saved to /var/cache/conftool/dbconfig/20240220-114906-arnaudb.json
  • 11:45 marostegui@cumin1002: dbctl commit (dc=all): 'db2169 (re)pooling @ 75%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57298 and previous config saved to /var/cache/conftool/dbconfig/20240220-114526-root.json
  • 11:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T355609)', diff saved to https://phabricator.wikimedia.org/P57297 and previous config saved to /var/cache/conftool/dbconfig/20240220-114349-marostegui.json
  • 11:42 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2369.codfw.wmnet with reason: host reimage
  • 11:39 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2367.codfw.wmnet with reason: host reimage
  • 11:37 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2313.codfw.wmnet with reason: host reimage
  • 11:35 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2367.codfw.wmnet with reason: host reimage
  • 11:35 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2369.codfw.wmnet with reason: host reimage
  • 11:34 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P57296 and previous config saved to /var/cache/conftool/dbconfig/20240220-113401-arnaudb.json
  • 11:33 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2312.codfw.wmnet with reason: host reimage
  • 11:33 marostegui@cumin1002: START - Cookbook sre.mysql.clone of db2190.codfw.wmnet onto db2194.codfw.wmnet
  • 11:30 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2312.codfw.wmnet with reason: host reimage
  • 11:30 marostegui@cumin1002: dbctl commit (dc=all): 'db2169 (re)pooling @ 50%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57295 and previous config saved to /var/cache/conftool/dbconfig/20240220-113021-root.json
  • 11:30 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw2385.codfw.wmnet with OS bullseye
  • 11:30 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw2384.codfw.wmnet with OS bullseye
  • 11:29 hnowlan@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw2385.codfw.wmnet with OS bullseye
  • 11:29 hnowlan@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw2384.codfw.wmnet with OS bullseye
  • 11:29 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2194.codfw.wmnet with OS bookworm
  • 11:19 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw2385.codfw.wmnet with OS bullseye
  • 11:19 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw2384.codfw.wmnet with OS bullseye
  • 11:19 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw2369.codfw.wmnet with OS bullseye
  • 11:19 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw2367.codfw.wmnet with OS bullseye
  • 11:19 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw2313.codfw.wmnet with OS bullseye
  • 11:18 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1210 (T357189)', diff saved to https://phabricator.wikimedia.org/P57294 and previous config saved to /var/cache/conftool/dbconfig/20240220-111854-arnaudb.json
  • 11:17 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1186 (T355609)', diff saved to https://phabricator.wikimedia.org/P57293 and previous config saved to /var/cache/conftool/dbconfig/20240220-111722-marostegui.json
  • 11:17 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1186.eqiad.wmnet with reason: Maintenance
  • 11:17 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1186.eqiad.wmnet with reason: Maintenance
  • 11:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T355609)', diff saved to https://phabricator.wikimedia.org/P57292 and previous config saved to /var/cache/conftool/dbconfig/20240220-111700-marostegui.json
  • 11:15 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1210 (T357189)', diff saved to https://phabricator.wikimedia.org/P57291 and previous config saved to /var/cache/conftool/dbconfig/20240220-111531-arnaudb.json
  • 11:15 marostegui@cumin1002: dbctl commit (dc=all): 'db2167 (re)pooling @ 100%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57290 and previous config saved to /var/cache/conftool/dbconfig/20240220-111525-root.json
  • 11:15 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1210.eqiad.wmnet with reason: Maintenance
  • 11:15 marostegui@cumin1002: dbctl commit (dc=all): 'db2169 (re)pooling @ 25%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57289 and previous config saved to /var/cache/conftool/dbconfig/20240220-111516-root.json
  • 11:15 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1210.eqiad.wmnet with reason: Maintenance
  • 11:15 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T357189)', diff saved to https://phabricator.wikimedia.org/P57288 and previous config saved to /var/cache/conftool/dbconfig/20240220-111510-arnaudb.json
  • 11:14 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw2312.codfw.wmnet with OS bullseye
  • 11:08 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2194.codfw.wmnet with reason: host reimage
  • 11:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2194.codfw.wmnet with reason: host reimage
  • 11:04 marostegui@cumin1002: dbctl commit (dc=all): 'Place db2194 in s3 depooled T354826', diff saved to https://phabricator.wikimedia.org/P57287 and previous config saved to /var/cache/conftool/dbconfig/20240220-110444-marostegui.json
  • 11:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P57286 and previous config saved to /var/cache/conftool/dbconfig/20240220-110154-marostegui.json
  • 11:01 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2190', diff saved to https://phabricator.wikimedia.org/P57285 and previous config saved to /var/cache/conftool/dbconfig/20240220-110020-root.json
  • 11:00 marostegui@cumin1002: dbctl commit (dc=all): 'db2169 (re)pooling @ 10%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57284 and previous config saved to /var/cache/conftool/dbconfig/20240220-110011-root.json
  • 11:00 marostegui@cumin1002: dbctl commit (dc=all): 'db2167 (re)pooling @ 75%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57283 and previous config saved to /var/cache/conftool/dbconfig/20240220-110008-root.json
  • 11:00 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P57282 and previous config saved to /var/cache/conftool/dbconfig/20240220-110004-arnaudb.json
  • 11:00 marostegui@cumin1002: dbctl commit (dc=all): 'Remove db2194 multi instance', diff saved to https://phabricator.wikimedia.org/P57281 and previous config saved to /var/cache/conftool/dbconfig/20240220-105959-marostegui.json
  • 10:56 slyngs: Import CAS 6.6.12+wmf11u2 in apt-repo
  • 10:50 aborrero@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on cloudvirt1032.eqiad.wmnet with reason: nova-compute registration
  • 10:50 aborrero@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on cloudvirt1032.eqiad.wmnet with reason: nova-compute registration
  • 10:48 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2194.codfw.wmnet with OS bookworm
  • 10:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P57280 and previous config saved to /var/cache/conftool/dbconfig/20240220-104647-marostegui.json
  • 10:46 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2194', diff saved to https://phabricator.wikimedia.org/P57279 and previous config saved to /var/cache/conftool/dbconfig/20240220-104633-root.json
  • 10:42 marostegui@cumin1002: dbctl commit (dc=all): 'db2169 (re)pooling @ 5%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57278 and previous config saved to /var/cache/conftool/dbconfig/20240220-104231-root.json
  • 10:42 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P57277 and previous config saved to /var/cache/conftool/dbconfig/20240220-104209-arnaudb.json
  • 10:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2169.codfw.wmnet with OS bookworm
  • 10:38 marostegui@cumin1002: dbctl commit (dc=all): 'db2167 (re)pooling @ 50%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57276 and previous config saved to /var/cache/conftool/dbconfig/20240220-103842-root.json
  • 10:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on cumin1001.eqiad.wmnet with reason: being taken down
  • 10:34 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 5 days, 0:00:00 on cumin1001.eqiad.wmnet with reason: being taken down
  • 10:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T355609)', diff saved to https://phabricator.wikimedia.org/P57275 and previous config saved to /var/cache/conftool/dbconfig/20240220-103141-marostegui.json
  • 10:27 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T357189)', diff saved to https://phabricator.wikimedia.org/P57274 and previous config saved to /var/cache/conftool/dbconfig/20240220-102703-arnaudb.json
  • 10:23 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1200 (T357189)', diff saved to https://phabricator.wikimedia.org/P57273 and previous config saved to /var/cache/conftool/dbconfig/20240220-102344-arnaudb.json
  • 10:23 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1200.eqiad.wmnet with reason: Maintenance
  • 10:23 marostegui@cumin1002: dbctl commit (dc=all): 'db2167 (re)pooling @ 25%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57272 and previous config saved to /var/cache/conftool/dbconfig/20240220-102337-root.json
  • 10:23 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1200.eqiad.wmnet with reason: Maintenance
  • 10:23 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T357189)', diff saved to https://phabricator.wikimedia.org/P57271 and previous config saved to /var/cache/conftool/dbconfig/20240220-102322-arnaudb.json
  • 10:18 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2169.codfw.wmnet with reason: host reimage
  • 10:16 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2169.codfw.wmnet with reason: host reimage
  • 10:12 marostegui@cumin1002: dbctl commit (dc=all): 'db2138 (re)pooling @ 100%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57270 and previous config saved to /var/cache/conftool/dbconfig/20240220-101206-root.json
  • 10:10 akosiaris@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 10:10 akosiaris@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 10:08 marostegui@cumin1002: dbctl commit (dc=all): 'db2167 (re)pooling @ 10%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57269 and previous config saved to /var/cache/conftool/dbconfig/20240220-100832-root.json
  • 10:08 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P57268 and previous config saved to /var/cache/conftool/dbconfig/20240220-100816-arnaudb.json
  • 10:06 marostegui@cumin1002: dbctl commit (dc=all): 'Add db2169 to s6 depooled', diff saved to https://phabricator.wikimedia.org/P57267 and previous config saved to /var/cache/conftool/dbconfig/20240220-100623-marostegui.json
  • 10:05 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1169 (T355609)', diff saved to https://phabricator.wikimedia.org/P57266 and previous config saved to /var/cache/conftool/dbconfig/20240220-100511-marostegui.json
  • 10:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 10:04 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 10:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T355609)', diff saved to https://phabricator.wikimedia.org/P57265 and previous config saved to /var/cache/conftool/dbconfig/20240220-100449-marostegui.json
  • 10:04 marostegui@cumin1002: dbctl commit (dc=all): 'Remove db2169 multiinstance', diff saved to https://phabricator.wikimedia.org/P57264 and previous config saved to /var/cache/conftool/dbconfig/20240220-100444-marostegui.json
  • 10:00 akosiaris@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 09:57 marostegui@cumin1002: dbctl commit (dc=all): 'db2138 (re)pooling @ 75%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57263 and previous config saved to /var/cache/conftool/dbconfig/20240220-095701-root.json
  • 09:56 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2169.codfw.wmnet with OS bookworm
  • 09:53 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2169', diff saved to https://phabricator.wikimedia.org/P57262 and previous config saved to /var/cache/conftool/dbconfig/20240220-095353-root.json
  • 09:53 marostegui@cumin1002: dbctl commit (dc=all): 'db2167 (re)pooling @ 5%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57261 and previous config saved to /var/cache/conftool/dbconfig/20240220-095327-root.json
  • 09:53 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P57260 and previous config saved to /var/cache/conftool/dbconfig/20240220-095310-arnaudb.json
  • 09:49 akosiaris@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 09:46 filippo@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 09:46 filippo@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 09:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P57259 and previous config saved to /var/cache/conftool/dbconfig/20240220-094334-marostegui.json
  • 09:41 marostegui@cumin1002: dbctl commit (dc=all): 'db2138 (re)pooling @ 50%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57258 and previous config saved to /var/cache/conftool/dbconfig/20240220-094156-root.json
  • 09:38 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T357189)', diff saved to https://phabricator.wikimedia.org/P57257 and previous config saved to /var/cache/conftool/dbconfig/20240220-093803-arnaudb.json
  • 09:36 moritzm: installing imagemagick security updates
  • 09:36 marostegui@cumin1002: dbctl commit (dc=all): 'db2168 (re)pooling @ 100%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57256 and previous config saved to /var/cache/conftool/dbconfig/20240220-093607-root.json
  • 09:34 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1185 (T357189)', diff saved to https://phabricator.wikimedia.org/P57255 and previous config saved to /var/cache/conftool/dbconfig/20240220-093442-arnaudb.json
  • 09:34 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1185.eqiad.wmnet with reason: Maintenance
  • 09:34 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1185.eqiad.wmnet with reason: Maintenance
  • 09:34 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T357189)', diff saved to https://phabricator.wikimedia.org/P57254 and previous config saved to /var/cache/conftool/dbconfig/20240220-093420-arnaudb.json
  • 09:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P57253 and previous config saved to /var/cache/conftool/dbconfig/20240220-092827-marostegui.json
  • 09:26 marostegui@cumin1002: dbctl commit (dc=all): 'db2138 (re)pooling @ 25%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57252 and previous config saved to /var/cache/conftool/dbconfig/20240220-092651-root.json
  • 09:26 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2167.codfw.wmnet with OS bookworm
  • 09:23 akosiaris@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 09:22 akosiaris@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 09:21 akosiaris@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 09:21 marostegui@cumin1002: dbctl commit (dc=all): 'db2168 (re)pooling @ 75%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57251 and previous config saved to /var/cache/conftool/dbconfig/20240220-092102-root.json
  • 09:21 akosiaris@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 09:19 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P57250 and previous config saved to /var/cache/conftool/dbconfig/20240220-091914-arnaudb.json
  • 09:16 akosiaris@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 09:16 dcausse@deploy2002: Finished deploy [airflow-dags/search@088b013]: search: wdqs updater set proper start date (duration: 00m 26s)
  • 09:16 dcausse@deploy2002: Started deploy [airflow-dags/search@088b013]: search: wdqs updater set proper start date
  • 09:15 akosiaris@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 09:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T355609)', diff saved to https://phabricator.wikimedia.org/P57249 and previous config saved to /var/cache/conftool/dbconfig/20240220-091321-marostegui.json
  • 09:11 marostegui@cumin1002: dbctl commit (dc=all): 'db2138 (re)pooling @ 10%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57248 and previous config saved to /var/cache/conftool/dbconfig/20240220-091146-root.json
  • 09:09 akosiaris@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 09:08 akosiaris@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 09:05 marostegui@cumin1002: dbctl commit (dc=all): 'db2168 (re)pooling @ 50%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57247 and previous config saved to /var/cache/conftool/dbconfig/20240220-090557-root.json
  • 09:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2167.codfw.wmnet with reason: host reimage
  • 09:04 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P57246 and previous config saved to /var/cache/conftool/dbconfig/20240220-090408-arnaudb.json
  • 09:02 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2167.codfw.wmnet with reason: host reimage
  • 09:01 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2138.codfw.wmnet with OS bookworm
  • 08:57 dcausse@deploy2002: Finished deploy [airflow-dags/search@a6356d2]: search: wdqs-updater reconcile, do not create the dag dynamically (duration: 00m 28s)
  • 08:56 dcausse@deploy2002: Started deploy [airflow-dags/search@a6356d2]: search: wdqs-updater reconcile, do not create the dag dynamically
  • 08:56 marostegui@cumin1002: dbctl commit (dc=all): 'db2138 (re)pooling @ 5%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57245 and previous config saved to /var/cache/conftool/dbconfig/20240220-085641-root.json
  • 08:52 marostegui@cumin1002: dbctl commit (dc=all): 'db2171 (re)pooling @ 100%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57244 and previous config saved to /var/cache/conftool/dbconfig/20240220-085222-root.json
  • 08:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2168 (re)pooling @ 25%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57243 and previous config saved to /var/cache/conftool/dbconfig/20240220-085052-root.json
  • 08:49 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T357189)', diff saved to https://phabricator.wikimedia.org/P57242 and previous config saved to /var/cache/conftool/dbconfig/20240220-084901-arnaudb.json
  • 08:46 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1163 (T355609)', diff saved to https://phabricator.wikimedia.org/P57241 and previous config saved to /var/cache/conftool/dbconfig/20240220-084637-marostegui.json
  • 08:46 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
  • 08:46 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
  • 08:45 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1161 (T357189)', diff saved to https://phabricator.wikimedia.org/P57240 and previous config saved to /var/cache/conftool/dbconfig/20240220-084530-arnaudb.json
  • 08:45 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 08:45 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 08:45 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1161.eqiad.wmnet with reason: Maintenance
  • 08:44 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1161.eqiad.wmnet with reason: Maintenance
  • 08:43 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2167.codfw.wmnet with OS bookworm
  • 08:41 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2167', diff saved to https://phabricator.wikimedia.org/P57239 and previous config saved to /var/cache/conftool/dbconfig/20240220-084136-root.json
  • 08:40 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2138.codfw.wmnet with reason: host reimage
  • 08:37 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2138.codfw.wmnet with reason: host reimage
  • 08:37 marostegui@cumin1002: dbctl commit (dc=all): 'db2171 (re)pooling @ 75%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57238 and previous config saved to /var/cache/conftool/dbconfig/20240220-083718-root.json
  • 08:35 marostegui@cumin1002: dbctl commit (dc=all): 'db2168 (re)pooling @ 10%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57237 and previous config saved to /var/cache/conftool/dbconfig/20240220-083547-root.json
  • 08:31 marostegui@cumin1002: dbctl commit (dc=all): 'db2170 (re)pooling @ 5%: After migration', diff saved to https://phabricator.wikimedia.org/P57236 and previous config saved to /var/cache/conftool/dbconfig/20240220-083132-root.json
  • 08:25 marostegui@cumin1002: dbctl commit (dc=all): 'db1246 (re)pooling @ 100%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57235 and previous config saved to /var/cache/conftool/dbconfig/20240220-082515-root.json
  • 08:23 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2168.codfw.wmnet with OS bookworm
  • 08:22 marostegui@cumin1002: dbctl commit (dc=all): 'db2171 (re)pooling @ 50%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57234 and previous config saved to /var/cache/conftool/dbconfig/20240220-082213-root.json
  • 08:20 marostegui@cumin1002: dbctl commit (dc=all): 'db2168 (re)pooling @ 5%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57233 and previous config saved to /var/cache/conftool/dbconfig/20240220-082043-root.json
  • 08:19 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2138.codfw.wmnet with OS bookworm
  • 08:17 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2138', diff saved to https://phabricator.wikimedia.org/P57232 and previous config saved to /var/cache/conftool/dbconfig/20240220-081740-root.json
  • 08:16 marostegui@cumin1002: dbctl commit (dc=all): 'db2170 (re)pooling @ 1%: After migration', diff saved to https://phabricator.wikimedia.org/P57231 and previous config saved to /var/cache/conftool/dbconfig/20240220-081627-root.json
  • 08:16 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2170.codfw.wmnet with OS bookworm
  • 08:13 marostegui@cumin1002: dbctl commit (dc=all): 'db1244 (re)pooling @ 100%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57230 and previous config saved to /var/cache/conftool/dbconfig/20240220-081353-root.json
  • 08:10 marostegui@cumin1002: dbctl commit (dc=all): 'db1246 (re)pooling @ 75%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57229 and previous config saved to /var/cache/conftool/dbconfig/20240220-081010-root.json
  • 08:07 marostegui@cumin1002: dbctl commit (dc=all): 'db2171 (re)pooling @ 25%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57228 and previous config saved to /var/cache/conftool/dbconfig/20240220-080708-root.json
  • 08:03 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2168.codfw.wmnet with reason: host reimage
  • 08:00 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2168.codfw.wmnet with reason: host reimage
  • 07:58 marostegui@cumin1002: dbctl commit (dc=all): 'db1244 (re)pooling @ 75%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57227 and previous config saved to /var/cache/conftool/dbconfig/20240220-075848-root.json
  • 07:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2170.codfw.wmnet with reason: host reimage
  • 07:55 marostegui@cumin1002: dbctl commit (dc=all): 'db1246 (re)pooling @ 50%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57226 and previous config saved to /var/cache/conftool/dbconfig/20240220-075505-root.json
  • 07:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2170.codfw.wmnet with reason: host reimage
  • 07:52 marostegui@cumin1002: dbctl commit (dc=all): 'db2171 (re)pooling @ 10%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57225 and previous config saved to /var/cache/conftool/dbconfig/20240220-075203-root.json
  • 07:51 marostegui@cumin1002: dbctl commit (dc=all): 'es2021 (re)pooling @ 100%: After migration', diff saved to https://phabricator.wikimedia.org/P57224 and previous config saved to /var/cache/conftool/dbconfig/20240220-075128-root.json
  • 07:43 marostegui@cumin1002: dbctl commit (dc=all): 'db1244 (re)pooling @ 50%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57223 and previous config saved to /var/cache/conftool/dbconfig/20240220-074343-root.json
  • 07:40 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2168.codfw.wmnet with OS bookworm
  • 07:40 marostegui@cumin1002: dbctl commit (dc=all): 'db1246 (re)pooling @ 25%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57222 and previous config saved to /var/cache/conftool/dbconfig/20240220-074000-root.json
  • 07:39 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2168', diff saved to https://phabricator.wikimedia.org/P57221 and previous config saved to /var/cache/conftool/dbconfig/20240220-073912-root.json
  • 07:38 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2028.codfw.wmnet
  • 07:37 marostegui@cumin1002: dbctl commit (dc=all): 'db2171 (re)pooling @ 5%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57220 and previous config saved to /var/cache/conftool/dbconfig/20240220-073658-root.json
  • 07:36 marostegui@cumin1002: dbctl commit (dc=all): 'es2021 (re)pooling @ 75%: After migration', diff saved to https://phabricator.wikimedia.org/P57219 and previous config saved to /var/cache/conftool/dbconfig/20240220-073623-root.json
  • 07:34 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2028.codfw.wmnet
  • 07:33 marostegui@cumin1002: dbctl commit (dc=all): 'db1170 (re)pooling @ 100%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57218 and previous config saved to /var/cache/conftool/dbconfig/20240220-073313-root.json
  • 07:32 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2170.codfw.wmnet with OS bookworm
  • 07:31 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2170', diff saved to https://phabricator.wikimedia.org/P57217 and previous config saved to /var/cache/conftool/dbconfig/20240220-073139-root.json
  • 07:28 marostegui@cumin1002: dbctl commit (dc=all): 'db1244 (re)pooling @ 25%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57216 and previous config saved to /var/cache/conftool/dbconfig/20240220-072838-root.json
  • 07:28 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2171.codfw.wmnet with OS bookworm
  • 07:27 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 56286
  • 07:27 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 56286
  • 07:27 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 60501
  • 07:26 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 60501
  • 07:26 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 18779
  • 07:26 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 18779
  • 07:26 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 26554
  • 07:25 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 26554
  • 07:24 marostegui@cumin1002: dbctl commit (dc=all): 'db1246 (re)pooling @ 10%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57215 and previous config saved to /var/cache/conftool/dbconfig/20240220-072455-root.json
  • 07:21 marostegui@cumin1002: dbctl commit (dc=all): 'es2021 (re)pooling @ 50%: After migration', diff saved to https://phabricator.wikimedia.org/P57214 and previous config saved to /var/cache/conftool/dbconfig/20240220-072118-root.json
  • 07:18 marostegui@cumin1002: dbctl commit (dc=all): 'db1170 (re)pooling @ 75%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57213 and previous config saved to /var/cache/conftool/dbconfig/20240220-071808-root.json
  • 07:13 marostegui@cumin1002: dbctl commit (dc=all): 'db1244 (re)pooling @ 10%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57212 and previous config saved to /var/cache/conftool/dbconfig/20240220-071333-root.json
  • 07:09 marostegui@cumin1002: dbctl commit (dc=all): 'db1246 (re)pooling @ 5%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57211 and previous config saved to /var/cache/conftool/dbconfig/20240220-070948-root.json
  • 07:06 marostegui@cumin1002: dbctl commit (dc=all): 'es2021 (re)pooling @ 25%: After migration', diff saved to https://phabricator.wikimedia.org/P57210 and previous config saved to /var/cache/conftool/dbconfig/20240220-070613-root.json
  • 07:04 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1246.eqiad.wmnet with OS bookworm
  • 07:03 marostegui@cumin1002: dbctl commit (dc=all): 'db1170 (re)pooling @ 50%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57209 and previous config saved to /var/cache/conftool/dbconfig/20240220-070303-root.json
  • 07:02 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2171.codfw.wmnet with reason: host reimage
  • 06:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1244.eqiad.wmnet with OS bookworm
  • 06:58 marostegui@cumin1002: dbctl commit (dc=all): 'db1244 (re)pooling @ 5%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57208 and previous config saved to /var/cache/conftool/dbconfig/20240220-065828-root.json
  • 06:58 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2171.codfw.wmnet with reason: host reimage
  • 06:51 marostegui@cumin1002: dbctl commit (dc=all): 'es2021 (re)pooling @ 10%: After migration', diff saved to https://phabricator.wikimedia.org/P57207 and previous config saved to /var/cache/conftool/dbconfig/20240220-065108-root.json
  • 06:47 marostegui@cumin1002: dbctl commit (dc=all): 'db1170 (re)pooling @ 25%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57206 and previous config saved to /var/cache/conftool/dbconfig/20240220-064758-root.json
  • 06:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1246.eqiad.wmnet with reason: host reimage
  • 06:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1246.eqiad.wmnet with reason: host reimage
  • 06:41 marostegui@cumin1002: dbctl commit (dc=all): 'Place db2171 in s5 depooled T354826', diff saved to https://phabricator.wikimedia.org/P57205 and previous config saved to /var/cache/conftool/dbconfig/20240220-064152-marostegui.json
  • 06:40 marostegui@cumin1002: dbctl commit (dc=all): 'Remove db2171 multi-instance', diff saved to https://phabricator.wikimedia.org/P57204 and previous config saved to /var/cache/conftool/dbconfig/20240220-064014-marostegui.json
  • 06:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1244.eqiad.wmnet with reason: host reimage
  • 06:39 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2171.codfw.wmnet with OS bookworm
  • 06:37 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1244.eqiad.wmnet with reason: host reimage
  • 06:36 marostegui@cumin1002: dbctl commit (dc=all): 'es2021 (re)pooling @ 5%: After migration', diff saved to https://phabricator.wikimedia.org/P57203 and previous config saved to /var/cache/conftool/dbconfig/20240220-063603-root.json
  • 06:35 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2171 T354826', diff saved to https://phabricator.wikimedia.org/P57202 and previous config saved to /var/cache/conftool/dbconfig/20240220-063521-marostegui.json
  • 06:32 marostegui@cumin1002: dbctl commit (dc=all): 'db1170 (re)pooling @ 10%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57201 and previous config saved to /var/cache/conftool/dbconfig/20240220-063254-root.json
  • 06:29 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1246.eqiad.wmnet with OS bookworm
  • 06:28 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1246', diff saved to https://phabricator.wikimedia.org/P57200 and previous config saved to /var/cache/conftool/dbconfig/20240220-062759-root.json
  • 06:24 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1244.eqiad.wmnet with OS bookworm
  • 06:22 marostegui@deploy2002: Finished scap: Backport for Revert "db-production.php: Disable writes on es4" (duration: 09m 32s)
  • 06:20 marostegui@cumin1002: dbctl commit (dc=all): 'es2021 (re)pooling @ 1%: After migration', diff saved to https://phabricator.wikimedia.org/P57199 and previous config saved to /var/cache/conftool/dbconfig/20240220-062058-root.json
  • 06:19 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1244', diff saved to https://phabricator.wikimedia.org/P57198 and previous config saved to /var/cache/conftool/dbconfig/20240220-061932-root.json
  • 06:17 marostegui@cumin1002: dbctl commit (dc=all): 'db1170 (re)pooling @ 5%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57197 and previous config saved to /var/cache/conftool/dbconfig/20240220-061749-root.json
  • 06:14 marostegui@deploy2002: marostegui: Continuing with sync
  • 06:14 marostegui@deploy2002: marostegui: Backport for Revert "db-production.php: Disable writes on es4" synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 06:13 marostegui@deploy2002: Started scap: Backport for Revert "db-production.php: Disable writes on es4"
  • 06:13 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1170.eqiad.wmnet with OS bookworm
  • 06:10 marostegui@cumin1002: dbctl commit (dc=all): 'Add weight to es2020', diff saved to https://phabricator.wikimedia.org/P57196 and previous config saved to /var/cache/conftool/dbconfig/20240220-061049-root.json
  • 06:10 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es2021 T356372', diff saved to https://phabricator.wikimedia.org/P57195 and previous config saved to /var/cache/conftool/dbconfig/20240220-061025-marostegui.json
  • 06:08 marostegui@cumin1002: dbctl commit (dc=all): 'Promote es2020 to es4 primary T356372', diff saved to https://phabricator.wikimedia.org/P57194 and previous config saved to /var/cache/conftool/dbconfig/20240220-060852-marostegui.json
  • 06:08 marostegui: Starting es4 codfw failover from es2021 to es2020 - T356372
  • 06:04 marostegui@cumin1002: dbctl commit (dc=all): 'Set es2020 with weight 0 T356372', diff saved to https://phabricator.wikimedia.org/P57193 and previous config saved to /var/cache/conftool/dbconfig/20240220-060404-marostegui.json
  • 06:03 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: Primary switchover es4 T356372
  • 06:03 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 6 hosts with reason: Primary switchover es4 T356372
  • 06:01 marostegui@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db2137.codfw.wmnet with OS bookworm
  • 06:00 marostegui@deploy2002: Finished scap: Backport for db-production.php: Disable writes on es4 (T356372) (duration: 09m 36s)
  • 05:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1170.eqiad.wmnet with reason: host reimage
  • 05:55 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2137.codfw.wmnet with OS bookworm
  • 05:54 marostegui@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db2137.codfw.wmnet with OS bookworm
  • 05:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1170.eqiad.wmnet with reason: host reimage
  • 05:52 marostegui@deploy2002: marostegui: Continuing with sync
  • 05:52 marostegui@deploy2002: marostegui: Backport for db-production.php: Disable writes on es4 (T356372) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 05:50 marostegui@deploy2002: Started scap: Backport for db-production.php: Disable writes on es4 (T356372)
  • 05:45 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2137.codfw.wmnet with OS bookworm
  • 05:41 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2137 for reimage', diff saved to https://phabricator.wikimedia.org/P57192 and previous config saved to /var/cache/conftool/dbconfig/20240220-054156-marostegui.json
  • 05:41 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1170.eqiad.wmnet with OS bookworm
  • 05:39 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1170 for reimage', diff saved to https://phabricator.wikimedia.org/P57191 and previous config saved to /var/cache/conftool/dbconfig/20240220-053920-marostegui.json
  • 04:56 mwpresync@deploy2002: Finished scap: testwikis wikis to 1.42.0-wmf.19 refs T354437 (duration: 52m 09s)
  • 04:04 mwpresync@deploy2002: Started scap: testwikis wikis to 1.42.0-wmf.19 refs T354437
  • 04:02 mwpresync@deploy2002: Pruned MediaWiki: 1.42.0-wmf.16 (duration: 01m 57s)
  • 02:15 tstarling@deploy2002: Synchronized wmf-config/CommonSettings.php: Set $wgLoginNotifyUseCheckUser = false T346989 (duration: 08m 13s)

2024-02-19

  • 23:43 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2194.codfw.wmnet with reason: Maintenance
  • 23:42 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2194.codfw.wmnet with reason: Maintenance
  • 23:42 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193 (T357189)', diff saved to https://phabricator.wikimedia.org/P57190 and previous config saved to /var/cache/conftool/dbconfig/20240219-234251-arnaudb.json
  • 23:27 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P57189 and previous config saved to /var/cache/conftool/dbconfig/20240219-232745-arnaudb.json
  • 23:12 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P57188 and previous config saved to /var/cache/conftool/dbconfig/20240219-231238-arnaudb.json
  • 22:57 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193 (T357189)', diff saved to https://phabricator.wikimedia.org/P57187 and previous config saved to /var/cache/conftool/dbconfig/20240219-225732-arnaudb.json
  • 22:41 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2193 (T357189)', diff saved to https://phabricator.wikimedia.org/P57186 and previous config saved to /var/cache/conftool/dbconfig/20240219-224117-arnaudb.json
  • 22:41 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2193.codfw.wmnet with reason: Maintenance
  • 22:40 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2193.codfw.wmnet with reason: Maintenance
  • 22:40 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T357189)', diff saved to https://phabricator.wikimedia.org/P57185 and previous config saved to /var/cache/conftool/dbconfig/20240219-224054-arnaudb.json
  • 22:25 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P57184 and previous config saved to /var/cache/conftool/dbconfig/20240219-222547-arnaudb.json
  • 22:12 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T352010)', diff saved to https://phabricator.wikimedia.org/P57183 and previous config saved to /var/cache/conftool/dbconfig/20240219-221239-ladsgroup.json
  • 22:10 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P57182 and previous config saved to /var/cache/conftool/dbconfig/20240219-221041-arnaudb.json
  • 21:57 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P57181 and previous config saved to /var/cache/conftool/dbconfig/20240219-215733-ladsgroup.json
  • 21:55 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T357189)', diff saved to https://phabricator.wikimedia.org/P57180 and previous config saved to /var/cache/conftool/dbconfig/20240219-215534-arnaudb.json
  • 21:52 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2180 (T357189)', diff saved to https://phabricator.wikimedia.org/P57179 and previous config saved to /var/cache/conftool/dbconfig/20240219-215217-arnaudb.json
  • 21:52 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2180.codfw.wmnet with reason: Maintenance
  • 21:52 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2180.codfw.wmnet with reason: Maintenance
  • 21:51 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316 (T357189)', diff saved to https://phabricator.wikimedia.org/P57178 and previous config saved to /var/cache/conftool/dbconfig/20240219-215155-arnaudb.json
  • 21:42 zabe@deploy2002: Finished scap: Backport for EditAttemptStep: log buckets for the edit check test (T342930), Enrollment for the edit check a/b test (T342930), Launch the Visual Editor edit check a/b test (T342930 T352127), Default VE on mobile for other wikis (T352127) (duration: 17m 25s)
  • 21:42 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P57177 and previous config saved to /var/cache/conftool/dbconfig/20240219-214227-ladsgroup.json
  • 21:36 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316', diff saved to https://phabricator.wikimedia.org/P57176 and previous config saved to /var/cache/conftool/dbconfig/20240219-213648-arnaudb.json
  • 21:35 zabe@deploy2002: kemayo and zabe: Continuing with sync
  • 21:27 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T352010)', diff saved to https://phabricator.wikimedia.org/P57175 and previous config saved to /var/cache/conftool/dbconfig/20240219-212720-ladsgroup.json
  • 21:26 zabe@deploy2002: kemayo and zabe: Backport for EditAttemptStep: log buckets for the edit check test (T342930), Enrollment for the edit check a/b test (T342930), Launch the Visual Editor edit check a/b test (T342930 T352127), Default VE on mobile for other wikis (T352127) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:25 zabe@deploy2002: Started scap: Backport for EditAttemptStep: log buckets for the edit check test (T342930), Enrollment for the edit check a/b test (T342930), Launch the Visual Editor edit check a/b test (T342930 T352127), Default VE on mobile for other wikis (T352127)
  • 21:21 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316', diff saved to https://phabricator.wikimedia.org/P57174 and previous config saved to /var/cache/conftool/dbconfig/20240219-212141-arnaudb.json
  • 21:06 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316 (T357189)', diff saved to https://phabricator.wikimedia.org/P57173 and previous config saved to /var/cache/conftool/dbconfig/20240219-210635-arnaudb.json
  • 21:02 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2171:3316 (T357189)', diff saved to https://phabricator.wikimedia.org/P57172 and previous config saved to /var/cache/conftool/dbconfig/20240219-210228-arnaudb.json
  • 21:02 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2171.codfw.wmnet with reason: Maintenance
  • 21:01 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2171.codfw.wmnet with reason: Maintenance
  • 20:59 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
  • 20:59 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
  • 20:59 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T357189)', diff saved to https://phabricator.wikimedia.org/P57171 and previous config saved to /var/cache/conftool/dbconfig/20240219-205935-arnaudb.json
  • 20:50 arnaudb@cumin1002: dbctl commit (dc=all): 'db2194:3316 (re)pooling @ 100%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57169 and previous config saved to /var/cache/conftool/dbconfig/20240219-205047-arnaudb.json
  • 20:44 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P57168 and previous config saved to /var/cache/conftool/dbconfig/20240219-204429-arnaudb.json
  • 20:35 arnaudb@cumin1002: dbctl commit (dc=all): 'db2194:3316 (re)pooling @ 75%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57167 and previous config saved to /var/cache/conftool/dbconfig/20240219-203542-arnaudb.json
  • 20:29 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P57166 and previous config saved to /var/cache/conftool/dbconfig/20240219-202923-arnaudb.json
  • 20:26 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2147 (T352010)', diff saved to https://phabricator.wikimedia.org/P57165 and previous config saved to /var/cache/conftool/dbconfig/20240219-202648-ladsgroup.json
  • 20:26 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2147.codfw.wmnet with reason: Maintenance
  • 20:26 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2147.codfw.wmnet with reason: Maintenance
  • 20:26 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2136 (T352010)', diff saved to https://phabricator.wikimedia.org/P57164 and previous config saved to /var/cache/conftool/dbconfig/20240219-202615-ladsgroup.json
  • 20:20 arnaudb@cumin1002: dbctl commit (dc=all): 'db2194:3316 (re)pooling @ 50%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57163 and previous config saved to /var/cache/conftool/dbconfig/20240219-202037-arnaudb.json
  • 20:14 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T357189)', diff saved to https://phabricator.wikimedia.org/P57162 and previous config saved to /var/cache/conftool/dbconfig/20240219-201416-arnaudb.json
  • 20:13 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2156 (re)pooling @ 100%: Maint over', diff saved to https://phabricator.wikimedia.org/P57161 and previous config saved to /var/cache/conftool/dbconfig/20240219-201353-root.json
  • 20:11 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2136', diff saved to https://phabricator.wikimedia.org/P57160 and previous config saved to /var/cache/conftool/dbconfig/20240219-201109-ladsgroup.json
  • 20:09 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2158 (T357189)', diff saved to https://phabricator.wikimedia.org/P57159 and previous config saved to /var/cache/conftool/dbconfig/20240219-200914-arnaudb.json
  • 20:09 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 20:09 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 20:09 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2158.codfw.wmnet with reason: Maintenance
  • 20:09 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2158.codfw.wmnet with reason: Maintenance
  • 20:08 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T357189)', diff saved to https://phabricator.wikimedia.org/P57158 and previous config saved to /var/cache/conftool/dbconfig/20240219-200847-arnaudb.json
  • 20:05 arnaudb@cumin1002: dbctl commit (dc=all): 'db2194:3316 (re)pooling @ 40%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57157 and previous config saved to /var/cache/conftool/dbconfig/20240219-200533-arnaudb.json
  • 20:05 zabe@deploy2002: Finished scap: Backport for Remove reviewer group from testwiki (T356012) (duration: 09m 16s)
  • 19:58 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2156 (re)pooling @ 75%: Maint over', diff saved to https://phabricator.wikimedia.org/P57156 and previous config saved to /var/cache/conftool/dbconfig/20240219-195848-root.json
  • 19:57 zabe@deploy2002: zabe: Continuing with sync
  • 19:57 zabe@deploy2002: zabe: Backport for Remove reviewer group from testwiki (T356012) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 19:56 zabe@deploy2002: Started scap: Backport for Remove reviewer group from testwiki (T356012)
  • 19:56 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2136', diff saved to https://phabricator.wikimedia.org/P57155 and previous config saved to /var/cache/conftool/dbconfig/20240219-195603-ladsgroup.json
  • 19:53 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P57154 and previous config saved to /var/cache/conftool/dbconfig/20240219-195341-arnaudb.json
  • 19:50 arnaudb@cumin1002: dbctl commit (dc=all): 'db2194:3316 (re)pooling @ 30%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57153 and previous config saved to /var/cache/conftool/dbconfig/20240219-195028-arnaudb.json
  • 19:43 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2156 (re)pooling @ 25%: Maint over', diff saved to https://phabricator.wikimedia.org/P57152 and previous config saved to /var/cache/conftool/dbconfig/20240219-194343-root.json
  • 19:42 zabe: zabe@mwmaint2002:/tmp/uploads$ mwscript emptyUserGroup.php --wiki=testwiki reviewer # T356012
  • 19:41 zabe: zabe@mwmaint2002:/tmp/uploads$ mwscript importImages.php --wiki=commonswiki --user="Yann" --overwrite . # T357218
  • 19:40 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2136 (T352010)', diff saved to https://phabricator.wikimedia.org/P57151 and previous config saved to /var/cache/conftool/dbconfig/20240219-194056-ladsgroup.json
  • 19:38 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P57150 and previous config saved to /var/cache/conftool/dbconfig/20240219-193834-arnaudb.json
  • 19:35 arnaudb@cumin1002: dbctl commit (dc=all): 'db2194:3316 (re)pooling @ 20%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57149 and previous config saved to /var/cache/conftool/dbconfig/20240219-193522-arnaudb.json
  • 19:28 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2156 (re)pooling @ 10%: Maint over', diff saved to https://phabricator.wikimedia.org/P57148 and previous config saved to /var/cache/conftool/dbconfig/20240219-192838-root.json
  • 19:23 ladsgroup@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db2149.codfw.wmnet onto db2156.codfw.wmnet
  • 19:23 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T357189)', diff saved to https://phabricator.wikimedia.org/P57147 and previous config saved to /var/cache/conftool/dbconfig/20240219-192327-arnaudb.json
  • 19:20 arnaudb@cumin1002: dbctl commit (dc=all): 'db2194:3316 (re)pooling @ 10%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57146 and previous config saved to /var/cache/conftool/dbconfig/20240219-192018-arnaudb.json
  • 19:19 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2151 (T357189)', diff saved to https://phabricator.wikimedia.org/P57145 and previous config saved to /var/cache/conftool/dbconfig/20240219-191923-arnaudb.json
  • 19:19 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2151.codfw.wmnet with reason: Maintenance
  • 19:19 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2151.codfw.wmnet with reason: Maintenance
  • 19:19 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124 (T357189)', diff saved to https://phabricator.wikimedia.org/P57144 and previous config saved to /var/cache/conftool/dbconfig/20240219-191901-arnaudb.json
  • 19:14 zabe: zabe@mwmaint2002:/tmp/uploads$ mwscript importImages.php --wiki=commonswiki --comment-ext=txt --user="Yann" . # T357297
  • 19:05 arnaudb@cumin1002: dbctl commit (dc=all): 'db2194:3316 (re)pooling @ 8%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57143 and previous config saved to /var/cache/conftool/dbconfig/20240219-190513-arnaudb.json
  • 19:03 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124', diff saved to https://phabricator.wikimedia.org/P57142 and previous config saved to /var/cache/conftool/dbconfig/20240219-190354-arnaudb.json
  • 18:50 arnaudb@cumin1002: dbctl commit (dc=all): 'db2194:3316 (re)pooling @ 4%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57141 and previous config saved to /var/cache/conftool/dbconfig/20240219-185008-arnaudb.json
  • 18:48 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124', diff saved to https://phabricator.wikimedia.org/P57140 and previous config saved to /var/cache/conftool/dbconfig/20240219-184848-arnaudb.json
  • 18:35 arnaudb@cumin1002: dbctl commit (dc=all): 'db2194:3316 (re)pooling @ 2%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57139 and previous config saved to /var/cache/conftool/dbconfig/20240219-183503-arnaudb.json
  • 18:33 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124 (T357189)', diff saved to https://phabricator.wikimedia.org/P57138 and previous config saved to /var/cache/conftool/dbconfig/20240219-183341-arnaudb.json
  • 18:29 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2124 (T357189)', diff saved to https://phabricator.wikimedia.org/P57137 and previous config saved to /var/cache/conftool/dbconfig/20240219-182929-arnaudb.json
  • 18:29 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2124.codfw.wmnet with reason: Maintenance
  • 18:29 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2124.codfw.wmnet with reason: Maintenance
  • 18:29 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2117 (T357189)', diff saved to https://phabricator.wikimedia.org/P57136 and previous config saved to /var/cache/conftool/dbconfig/20240219-182905-arnaudb.json
  • 18:19 arnaudb@cumin1002: dbctl commit (dc=all): 'db2194:3316 (re)pooling @ 1%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57135 and previous config saved to /var/cache/conftool/dbconfig/20240219-181958-arnaudb.json
  • 18:19 arnaudb@cumin1002: dbctl commit (dc=all): 'db2194:3317 (re)pooling @ 100%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57134 and previous config saved to /var/cache/conftool/dbconfig/20240219-181953-arnaudb.json
  • 18:14 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2117', diff saved to https://phabricator.wikimedia.org/P57133 and previous config saved to /var/cache/conftool/dbconfig/20240219-181359-arnaudb.json
  • 18:04 arnaudb@cumin1002: dbctl commit (dc=all): 'db2194:3317 (re)pooling @ 75%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57132 and previous config saved to /var/cache/conftool/dbconfig/20240219-180448-arnaudb.json
  • 17:58 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2117', diff saved to https://phabricator.wikimedia.org/P57131 and previous config saved to /var/cache/conftool/dbconfig/20240219-175853-arnaudb.json
  • 17:56 ladsgroup@cumin1002: START - Cookbook sre.mysql.clone of db2149.codfw.wmnet onto db2156.codfw.wmnet
  • 17:49 arnaudb@cumin1002: dbctl commit (dc=all): 'db2194:3317 (re)pooling @ 50%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57130 and previous config saved to /var/cache/conftool/dbconfig/20240219-174943-arnaudb.json
  • 17:43 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2117 (T357189)', diff saved to https://phabricator.wikimedia.org/P57129 and previous config saved to /var/cache/conftool/dbconfig/20240219-174347-arnaudb.json
  • 17:43 hnowlan: running `decommission` for mw2312.codfw.wmnet,mw2313.codfw.wmnet,mw2367.codfw.wmnet,mw2369.codfw.wmnet,mw2384.codfw.wmnet,mw2385.codfw.wmnet before reimaging to k8s workers
  • 17:39 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2117 (T357189)', diff saved to https://phabricator.wikimedia.org/P57128 and previous config saved to /var/cache/conftool/dbconfig/20240219-173941-arnaudb.json
  • 17:39 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2117.codfw.wmnet with reason: Maintenance
  • 17:39 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2117.codfw.wmnet with reason: Maintenance
  • 17:39 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2114 (T357189)', diff saved to https://phabricator.wikimedia.org/P57127 and previous config saved to /var/cache/conftool/dbconfig/20240219-173919-arnaudb.json
  • 17:38 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2186.codfw.wmnet with reason: recloning db2156 (T352010)
  • 17:38 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2186.codfw.wmnet with reason: recloning db2156 (T352010)
  • 17:34 arnaudb@cumin1002: dbctl commit (dc=all): 'db2194:3317 (re)pooling @ 40%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57126 and previous config saved to /var/cache/conftool/dbconfig/20240219-173438-arnaudb.json
  • 17:34 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2149 for maint', diff saved to https://phabricator.wikimedia.org/P57125 and previous config saved to /var/cache/conftool/dbconfig/20240219-173411-ladsgroup.json
  • 17:24 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2114', diff saved to https://phabricator.wikimedia.org/P57124 and previous config saved to /var/cache/conftool/dbconfig/20240219-172412-arnaudb.json
  • 17:19 arnaudb@cumin1002: dbctl commit (dc=all): 'db2194:3317 (re)pooling @ 30%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57123 and previous config saved to /var/cache/conftool/dbconfig/20240219-171933-arnaudb.json
  • 17:09 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2114', diff saved to https://phabricator.wikimedia.org/P57122 and previous config saved to /var/cache/conftool/dbconfig/20240219-170906-arnaudb.json
  • 17:04 arnaudb@cumin1002: dbctl commit (dc=all): 'db2194:3317 (re)pooling @ 20%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57121 and previous config saved to /var/cache/conftool/dbconfig/20240219-170428-arnaudb.json
  • 16:55 marostegui@cumin1002: dbctl commit (dc=all): 'db1213 (re)pooling @ 100%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57120 and previous config saved to /var/cache/conftool/dbconfig/20240219-165503-root.json
  • 16:54 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2114 (T357189)', diff saved to https://phabricator.wikimedia.org/P57119 and previous config saved to /var/cache/conftool/dbconfig/20240219-165400-arnaudb.json
  • 16:50 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2114 (T357189)', diff saved to https://phabricator.wikimedia.org/P57118 and previous config saved to /var/cache/conftool/dbconfig/20240219-165032-arnaudb.json
  • 16:50 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2114.codfw.wmnet with reason: Maintenance
  • 16:50 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2114.codfw.wmnet with reason: Maintenance
  • 16:49 arnaudb@cumin1002: dbctl commit (dc=all): 'db2194:3317 (re)pooling @ 10%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57117 and previous config saved to /var/cache/conftool/dbconfig/20240219-164924-arnaudb.json
  • 16:48 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
  • 16:48 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
  • 16:48 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231 (T357189)', diff saved to https://phabricator.wikimedia.org/P57116 and previous config saved to /var/cache/conftool/dbconfig/20240219-164809-arnaudb.json
  • 16:39 marostegui@cumin1002: dbctl commit (dc=all): 'db1213 (re)pooling @ 75%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57115 and previous config saved to /var/cache/conftool/dbconfig/20240219-163958-root.json
  • 16:38 jgiannelos@deploy2002: Finished deploy [restbase/deploy@7e5e720]: Disable parsoid storage on restbase[1031:1033] (duration: 01m 55s)
  • 16:36 jgiannelos@deploy2002: Started deploy [restbase/deploy@7e5e720]: Disable parsoid storage on restbase[1031:1033]
  • 16:35 jgiannelos@deploy2002: Finished deploy [restbase/deploy@7e5e720]: Disable parsoid storage on restbase[2033:2035] (duration: 01m 19s)
  • 16:34 arnaudb@cumin1002: dbctl commit (dc=all): 'db2194:3317 (re)pooling @ 8%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57114 and previous config saved to /var/cache/conftool/dbconfig/20240219-163419-arnaudb.json
  • 16:33 jgiannelos@deploy2002: Started deploy [restbase/deploy@7e5e720]: Disable parsoid storage on restbase[2033:2035]
  • 16:33 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P57113 and previous config saved to /var/cache/conftool/dbconfig/20240219-163303-arnaudb.json
  • 16:32 jgiannelos@deploy2002: deploy aborted: Disable parsoid storage on all nodes (duration: 01m 57s)
  • 16:30 jgiannelos@deploy2002: Started deploy [restbase/deploy@7e5e720]: Disable parsoid storage on all nodes
  • 16:30 jgiannelos@deploy2002: Finished deploy [restbase/deploy@7e5e720]: Disable parsoid storage on all nodes (duration: 00m 07s)
  • 16:30 hnowlan@deploy2002: helmfile [codfw] [main] DONE helmfile.d/services/mw-jobrunner : sync
  • 16:30 hnowlan@deploy2002: helmfile [codfw] [main] START helmfile.d/services/mw-jobrunner : sync
  • 16:30 jgiannelos@deploy2002: Started deploy [restbase/deploy@7e5e720]: Disable parsoid storage on all nodes
  • 16:29 jgiannelos@deploy2002: deploy aborted: Disable parsoid storage on all nodes (duration: 00m 08s)
  • 16:29 jgiannelos@deploy2002: Started deploy [restbase/deploy@7e5e720]: Disable parsoid storage on all nodes
  • 16:29 hnowlan@deploy2002: helmfile [eqiad] [main] DONE helmfile.d/services/mw-jobrunner : sync
  • 16:29 hnowlan@deploy2002: helmfile [eqiad] [main] START helmfile.d/services/mw-jobrunner : sync
  • 16:24 marostegui@cumin1002: dbctl commit (dc=all): 'db1213 (re)pooling @ 50%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57112 and previous config saved to /var/cache/conftool/dbconfig/20240219-162453-root.json
  • 16:21 jgiannelos@deploy2002: Finished deploy [restbase/deploy@e5ed8d0]: (no justification provided) (duration: 00m 04s)
  • 16:21 jgiannelos@deploy2002: Started deploy [restbase/deploy@e5ed8d0]: (no justification provided)
  • 16:19 arnaudb@cumin1002: dbctl commit (dc=all): 'db2194:3317 (re)pooling @ 4%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57111 and previous config saved to /var/cache/conftool/dbconfig/20240219-161914-arnaudb.json
  • 16:19 jgiannelos@deploy2002: Finished deploy [restbase/deploy@e5ed8d0]: (no justification provided) (duration: 00m 07s)
  • 16:18 jgiannelos@deploy2002: Started deploy [restbase/deploy@e5ed8d0]: (no justification provided)
  • 16:17 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P57110 and previous config saved to /var/cache/conftool/dbconfig/20240219-161756-arnaudb.json
  • 16:17 jgiannelos@deploy2002: deploy aborted: Deploy latest restbase config in all nodes (duration: 00m 04s)
  • 16:16 jgiannelos@deploy2002: Started deploy [restbase/deploy@e5ed8d0]: Deploy latest restbase config in all nodes
  • 16:14 jgiannelos@deploy2002: Finished deploy [restbase/deploy@e5ed8d0]: (no justification provided) (duration: 00m 08s)
  • 16:14 jgiannelos@deploy2002: Started deploy [restbase/deploy@e5ed8d0]: (no justification provided)
  • 16:09 marostegui@cumin1002: dbctl commit (dc=all): 'db1213 (re)pooling @ 25%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57109 and previous config saved to /var/cache/conftool/dbconfig/20240219-160948-root.json
  • 16:04 jgiannelos@deploy2002: Finished deploy [restbase/deploy@e5ed8d0]: (no justification provided) (duration: 00m 23s)
  • 16:04 jgiannelos@deploy2002: Started deploy [restbase/deploy@e5ed8d0]: (no justification provided)
  • 16:04 arnaudb@cumin1002: dbctl commit (dc=all): 'db2194:3317 (re)pooling @ 2%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57108 and previous config saved to /var/cache/conftool/dbconfig/20240219-160409-arnaudb.json
  • 16:02 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231 (T357189)', diff saved to https://phabricator.wikimedia.org/P57107 and previous config saved to /var/cache/conftool/dbconfig/20240219-160249-arnaudb.json
  • 16:02 arnaudb@cumin1002: dbctl commit (dc=all): 'db2169:3316 (re)pooling @ 100%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57106 and previous config saved to /var/cache/conftool/dbconfig/20240219-160221-arnaudb.json
  • 15:59 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1231 (T357189)', diff saved to https://phabricator.wikimedia.org/P57105 and previous config saved to /var/cache/conftool/dbconfig/20240219-155936-arnaudb.json
  • 15:59 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1231.eqiad.wmnet with reason: Maintenance
  • 15:59 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1231.eqiad.wmnet with reason: Maintenance
  • 15:59 jgiannelos@deploy2002: Finished deploy [restbase/deploy@e5ed8d0]: Disable parsoid storage on restbase[2029:2032] (duration: 02m 56s)
  • 15:57 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1225.eqiad.wmnet with reason: Maintenance
  • 15:57 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1225.eqiad.wmnet with reason: Maintenance
  • 15:57 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224 (T357189)', diff saved to https://phabricator.wikimedia.org/P57104 and previous config saved to /var/cache/conftool/dbconfig/20240219-155702-arnaudb.json
  • 15:56 jgiannelos@deploy2002: Started deploy [restbase/deploy@e5ed8d0]: Disable parsoid storage on restbase[2029:2032]
  • 15:55 jgiannelos@deploy2002: Finished deploy [restbase/deploy@e5ed8d0]: Disable parsoid storage on restbase[1027:1030] (duration: 04m 11s)
  • 15:54 marostegui@cumin1002: dbctl commit (dc=all): 'db1213 (re)pooling @ 10%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57103 and previous config saved to /var/cache/conftool/dbconfig/20240219-155443-root.json
  • 15:51 jgiannelos@deploy2002: Started deploy [restbase/deploy@e5ed8d0]: Disable parsoid storage on restbase[1027:1030]
  • 15:49 arnaudb@cumin1002: dbctl commit (dc=all): 'db2194:3317 (re)pooling @ 1%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57102 and previous config saved to /var/cache/conftool/dbconfig/20240219-154904-arnaudb.json
  • 15:47 arnaudb@cumin1002: dbctl commit (dc=all): 'db2169:3316 (re)pooling @ 75%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57101 and previous config saved to /var/cache/conftool/dbconfig/20240219-154716-arnaudb.json
  • 15:42 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P57100 and previous config saved to /var/cache/conftool/dbconfig/20240219-154154-arnaudb.json
  • 15:41 arnaudb@cumin1002: dbctl commit (dc=all): 'T343674 - db2194 missing config', diff saved to https://phabricator.wikimedia.org/P57099 and previous config saved to /var/cache/conftool/dbconfig/20240219-154148-arnaudb.json
  • 15:41 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1213.eqiad.wmnet with OS bookworm
  • 15:39 marostegui@cumin1002: dbctl commit (dc=all): 'db1213 (re)pooling @ 5%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57098 and previous config saved to /var/cache/conftool/dbconfig/20240219-153938-root.json
  • 15:37 jgiannelos@deploy2002: Finished deploy [restbase/deploy@e5ed8d0]: Disable parsoid storage on restbase[2025:2028] (duration: 01m 28s)
  • 15:36 jgiannelos@deploy2002: Started deploy [restbase/deploy@e5ed8d0]: Disable parsoid storage on restbase[2025:2028]
  • 15:35 jgiannelos@deploy2002: Finished deploy [restbase/deploy@e5ed8d0]: Disable parsoid storage on restbase1026 (duration: 01m 55s)
  • 15:33 jgiannelos@deploy2002: Started deploy [restbase/deploy@e5ed8d0]: Disable parsoid storage on restbase1026
  • 15:33 jgiannelos@deploy2002: Finished deploy [restbase/deploy@e5ed8d0]: Disable parsoid storage on restbase[1023:1025] (duration: 01m 57s)
  • 15:32 arnaudb@cumin1002: dbctl commit (dc=all): 'db2169:3316 (re)pooling @ 50%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57097 and previous config saved to /var/cache/conftool/dbconfig/20240219-153211-arnaudb.json
  • 15:31 jgiannelos@deploy2002: Started deploy [restbase/deploy@e5ed8d0]: Disable parsoid storage on restbase[1023:1025]
  • 15:26 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P57096 and previous config saved to /var/cache/conftool/dbconfig/20240219-152634-arnaudb.json
  • 15:24 jgiannelos@deploy2002: Finished deploy [restbase/deploy@e5ed8d0]: Disable parsoid storage on restbase2024 (duration: 01m 24s)
  • 15:23 Lucas_WMDE: UTC afternoon backport+config window done
  • 15:22 jgiannelos@deploy2002: Started deploy [restbase/deploy@e5ed8d0]: Disable parsoid storage on restbase2024
  • 15:22 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for Increase move rate limit for extendedmovers in arwiki to 16/60 (T357229) (duration: 24m 34s)
  • 15:22 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1213.eqiad.wmnet with reason: host reimage
  • 15:22 jgiannelos@deploy2002: Finished deploy [restbase/deploy@e5ed8d0]: (no justification provided) (duration: 01m 30s)
  • 15:20 jgiannelos@deploy2002: Started deploy [restbase/deploy@e5ed8d0]: (no justification provided)
  • 15:19 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1213.eqiad.wmnet with reason: host reimage
  • 15:19 jgiannelos@deploy2002: Finished deploy [restbase/deploy@e5ed8d0]: (no justification provided) (duration: 01m 54s)
  • 15:17 jgiannelos@deploy2002: Started deploy [restbase/deploy@e5ed8d0]: (no justification provided)
  • 15:17 arnaudb@cumin1002: dbctl commit (dc=all): 'db2169:3316 (re)pooling @ 40%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57095 and previous config saved to /var/cache/conftool/dbconfig/20240219-151706-arnaudb.json
  • 15:15 jgiannelos@deploy2002: Finished deploy [restbase/deploy@e5ed8d0]: (no justification provided) (duration: 01m 55s)
  • 15:15 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset-next: apply
  • 15:14 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and gergesshamon: Continuing with sync
  • 15:14 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 15:13 jgiannelos@deploy2002: Started deploy [restbase/deploy@e5ed8d0]: (no justification provided)
  • 15:13 jgiannelos@deploy2002: Finished deploy [restbase/deploy@e5ed8d0]: (no justification provided) (duration: 01m 28s)
  • 15:12 marostegui@cumin1002: dbctl commit (dc=all): 'es1020 (re)pooling @ 100%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57094 and previous config saved to /var/cache/conftool/dbconfig/20240219-151246-root.json
  • 15:11 jgiannelos@deploy2002: Started deploy [restbase/deploy@e5ed8d0]: (no justification provided)
  • 15:11 jgiannelos@deploy2002: Finished deploy [restbase/deploy@e5ed8d0]: (no justification provided) (duration: 01m 55s)
  • 15:11 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224 (T357189)', diff saved to https://phabricator.wikimedia.org/P57093 and previous config saved to /var/cache/conftool/dbconfig/20240219-151127-arnaudb.json
  • 15:10 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset-next: apply
  • 15:09 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 15:09 jgiannelos@deploy2002: Started deploy [restbase/deploy@e5ed8d0]: (no justification provided)
  • 14:53 jgiannelos@deploy2002: Finished deploy [restbase/deploy@e5ed8d0]: (no justification provided) (duration: 01m 24s)
  • 14:52 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201', diff saved to https://phabricator.wikimedia.org/P57087 and previous config saved to /var/cache/conftool/dbconfig/20240219-145251-arnaudb.json
  • 14:51 jgiannelos@deploy2002: Started deploy [restbase/deploy@e5ed8d0]: (no justification provided)
  • 14:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2136 (T352010)', diff saved to https://phabricator.wikimedia.org/P57086 and previous config saved to /var/cache/conftool/dbconfig/20240219-145119-ladsgroup.json
  • 14:51 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2136.codfw.wmnet with reason: Maintenance
  • 14:51 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2136.codfw.wmnet with reason: Maintenance
  • 14:50 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2119 (T352010)', diff saved to https://phabricator.wikimedia.org/P57085 and previous config saved to /var/cache/conftool/dbconfig/20240219-145057-ladsgroup.json
  • 14:49 jgiannelos@deploy2002: Finished deploy [restbase/deploy@e5ed8d0]: (no justification provided) (duration: 01m 51s)
  • 14:48 jgiannelos@deploy2002: Started deploy [restbase/deploy@e5ed8d0]: (no justification provided)
  • 14:46 arnaudb@cumin1002: dbctl commit (dc=all): 'db2169:3316 (re)pooling @ 20%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57084 and previous config saved to /var/cache/conftool/dbconfig/20240219-144655-arnaudb.json
  • 14:44 marostegui@cumin1002: dbctl commit (dc=all): 'db2170 (re)pooling @ 100%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57083 and previous config saved to /var/cache/conftool/dbconfig/20240219-144422-root.json
  • 14:42 marostegui@cumin1002: dbctl commit (dc=all): 'es1020 (re)pooling @ 50%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57082 and previous config saved to /var/cache/conftool/dbconfig/20240219-144237-root.json
  • 14:37 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201', diff saved to https://phabricator.wikimedia.org/P57081 and previous config saved to /var/cache/conftool/dbconfig/20240219-143744-arnaudb.json
  • 14:37 reedy@deploy2002: Finished scap: Fix casing of MediaWiki (duration: 09m 11s)
  • 14:35 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2119', diff saved to https://phabricator.wikimedia.org/P57080 and previous config saved to /var/cache/conftool/dbconfig/20240219-143550-ladsgroup.json
  • 14:31 arnaudb@cumin1002: dbctl commit (dc=all): 'db2169:3316 (re)pooling @ 10%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57079 and previous config saved to /var/cache/conftool/dbconfig/20240219-143150-arnaudb.json
  • 14:31 arnaudb@cumin1002: dbctl commit (dc=all): 'db2169:3317 (re)pooling @ 100%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57078 and previous config saved to /var/cache/conftool/dbconfig/20240219-143145-arnaudb.json
  • 14:29 marostegui@cumin1002: dbctl commit (dc=all): 'db2170 (re)pooling @ 75%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57077 and previous config saved to /var/cache/conftool/dbconfig/20240219-142917-root.json
  • 14:28 reedy@deploy2002: Started scap: Fix casing of MediaWiki
  • 14:27 marostegui@cumin1002: dbctl commit (dc=all): 'es1020 (re)pooling @ 25%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57076 and previous config saved to /var/cache/conftool/dbconfig/20240219-142732-root.json
  • 14:22 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201 (T357189)', diff saved to https://phabricator.wikimedia.org/P57075 and previous config saved to /var/cache/conftool/dbconfig/20240219-142238-arnaudb.json
  • 14:20 akosiaris@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 14:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2119', diff saved to https://phabricator.wikimedia.org/P57074 and previous config saved to /var/cache/conftool/dbconfig/20240219-142044-ladsgroup.json
  • 14:19 akosiaris@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 14:19 akosiaris@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 14:19 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1201 (T357189)', diff saved to https://phabricator.wikimedia.org/P57073 and previous config saved to /var/cache/conftool/dbconfig/20240219-141919-arnaudb.json
  • 14:19 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1201.eqiad.wmnet with reason: Maintenance
  • 14:19 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1201.eqiad.wmnet with reason: Maintenance
  • 14:18 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187 (T357189)', diff saved to https://phabricator.wikimedia.org/P57072 and previous config saved to /var/cache/conftool/dbconfig/20240219-141858-arnaudb.json
  • 14:18 akosiaris@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 14:18 akosiaris@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 14:18 akosiaris@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 14:16 arnaudb@cumin1002: dbctl commit (dc=all): 'db2169:3317 (re)pooling @ 75%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57071 and previous config saved to /var/cache/conftool/dbconfig/20240219-141640-arnaudb.json
  • 14:14 marostegui@cumin1002: dbctl commit (dc=all): 'db2170 (re)pooling @ 50%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57070 and previous config saved to /var/cache/conftool/dbconfig/20240219-141412-root.json
  • 14:12 marostegui@cumin1002: dbctl commit (dc=all): 'es1020 (re)pooling @ 10%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57069 and previous config saved to /var/cache/conftool/dbconfig/20240219-141227-root.json
  • 14:05 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2119 (T352010)', diff saved to https://phabricator.wikimedia.org/P57068 and previous config saved to /var/cache/conftool/dbconfig/20240219-140538-ladsgroup.json
  • 14:03 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187', diff saved to https://phabricator.wikimedia.org/P57067 and previous config saved to /var/cache/conftool/dbconfig/20240219-140351-arnaudb.json
  • 14:01 arnaudb@cumin1002: dbctl commit (dc=all): 'db2169:3317 (re)pooling @ 50%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57066 and previous config saved to /var/cache/conftool/dbconfig/20240219-140135-arnaudb.json
  • 13:59 marostegui@cumin1002: dbctl commit (dc=all): 'db2170 (re)pooling @ 25%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57065 and previous config saved to /var/cache/conftool/dbconfig/20240219-135907-root.json
  • 13:58 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on dbproxy1027.eqiad.wmnet with reason: Silence for reboot T356240
  • 13:58 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on dbproxy1027.eqiad.wmnet with reason: Silence for reboot T356240
  • 13:58 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on dbproxy1026.eqiad.wmnet with reason: Silence for reboot T356240
  • 13:58 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on dbproxy1026.eqiad.wmnet with reason: Silence for reboot T356240
  • 13:58 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on dbproxy1024.eqiad.wmnet with reason: Silence for reboot T356240
  • 13:58 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on dbproxy1024.eqiad.wmnet with reason: Silence for reboot T356240
  • 13:57 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on dbproxy1023.eqiad.wmnet with reason: Silence for reboot T356240
  • 13:57 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on dbproxy1023.eqiad.wmnet with reason: Silence for reboot T356240
  • 13:57 marostegui@cumin1002: dbctl commit (dc=all): 'es1020 (re)pooling @ 5%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57064 and previous config saved to /var/cache/conftool/dbconfig/20240219-135722-root.json
  • 13:48 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187', diff saved to https://phabricator.wikimedia.org/P57063 and previous config saved to /var/cache/conftool/dbconfig/20240219-134845-arnaudb.json
  • 13:48 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es1020', diff saved to https://phabricator.wikimedia.org/P57062 and previous config saved to /var/cache/conftool/dbconfig/20240219-134804-root.json
  • 13:46 arnaudb@cumin1002: dbctl commit (dc=all): 'db2169:3317 (re)pooling @ 40%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57061 and previous config saved to /var/cache/conftool/dbconfig/20240219-134630-arnaudb.json
  • 13:45 marostegui@cumin1002: dbctl commit (dc=all): 'Promote es1021 to es4 primary ', diff saved to https://phabricator.wikimedia.org/P57060 and previous config saved to /var/cache/conftool/dbconfig/20240219-134551-root.json
  • 13:44 marostegui@cumin1002: dbctl commit (dc=all): 'db2170 (re)pooling @ 10%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57059 and previous config saved to /var/cache/conftool/dbconfig/20240219-134402-root.json
  • 13:43 marostegui: Starting es4 eqiad failover from es1020 to es1021 - T357904
  • 13:42 marostegui@cumin1002: dbctl commit (dc=all): 'Change weight of es1021', diff saved to https://phabricator.wikimedia.org/P57058 and previous config saved to /var/cache/conftool/dbconfig/20240219-134205-root.json
  • 13:40 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: es4 switchover T357904
  • 13:40 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 6 hosts with reason: es4 switchover T357904
  • 13:38 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on dbproxy1021.eqiad.wmnet with reason: Silence for reboot T356240
  • 13:37 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on dbproxy1021.eqiad.wmnet with reason: Silence for reboot T356240
  • 13:36 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on dbproxy1020.eqiad.wmnet with reason: Silence for reboot T356240
  • 13:35 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on dbproxy1020.eqiad.wmnet with reason: Silence for reboot T356240
  • 13:33 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187 (T357189)', diff saved to https://phabricator.wikimedia.org/P57057 and previous config saved to /var/cache/conftool/dbconfig/20240219-133339-arnaudb.json
  • 13:32 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es1021', diff saved to https://phabricator.wikimedia.org/P57056 and previous config saved to /var/cache/conftool/dbconfig/20240219-133245-root.json
  • 13:31 arnaudb@cumin1002: dbctl commit (dc=all): 'db2169:3317 (re)pooling @ 30%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57055 and previous config saved to /var/cache/conftool/dbconfig/20240219-133125-arnaudb.json
  • 13:30 moritzm: installing runc security updates on buster
  • 13:30 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1187 (T357189)', diff saved to https://phabricator.wikimedia.org/P57054 and previous config saved to /var/cache/conftool/dbconfig/20240219-133019-arnaudb.json
  • 13:30 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1187.eqiad.wmnet with reason: Maintenance
  • 13:30 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1187.eqiad.wmnet with reason: Maintenance
  • 13:30 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T357189)', diff saved to https://phabricator.wikimedia.org/P57053 and previous config saved to /var/cache/conftool/dbconfig/20240219-132958-arnaudb.json
  • 13:28 marostegui@cumin1002: dbctl commit (dc=all): 'db2170 (re)pooling @ 5%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57052 and previous config saved to /var/cache/conftool/dbconfig/20240219-132858-root.json
  • 13:26 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on dbproxy1025.eqiad.wmnet with reason: Silence for reboot T356240
  • 13:26 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on dbproxy1025.eqiad.wmnet with reason: Silence for reboot T356240
  • 13:17 marostegui@cumin1002: dbctl commit (dc=all): 'Add db2170 depooled', diff saved to https://phabricator.wikimedia.org/P57051 and previous config saved to /var/cache/conftool/dbconfig/20240219-131729-marostegui.json
  • 13:17 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on dbproxy1022.eqiad.wmnet with reason: Silence for reboot T356240
  • 13:16 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on dbproxy1022.eqiad.wmnet with reason: Silence for reboot T356240
  • 13:16 arnaudb@cumin1002: dbctl commit (dc=all): 'db2169:3317 (re)pooling @ 20%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57050 and previous config saved to /var/cache/conftool/dbconfig/20240219-131620-arnaudb.json
  • 13:16 marostegui@cumin1002: dbctl commit (dc=all): 'Remove db21170 multi-instance', diff saved to https://phabricator.wikimedia.org/P57049 and previous config saved to /var/cache/conftool/dbconfig/20240219-131609-marostegui.json
  • 13:14 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P57048 and previous config saved to /var/cache/conftool/dbconfig/20240219-131452-arnaudb.json
  • 13:12 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2170 T354826', diff saved to https://phabricator.wikimedia.org/P57047 and previous config saved to /var/cache/conftool/dbconfig/20240219-131245-root.json
  • 13:01 arnaudb@cumin1002: dbctl commit (dc=all): 'db2169:3317 (re)pooling @ 10%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57046 and previous config saved to /var/cache/conftool/dbconfig/20240219-130116-arnaudb.json
  • 12:59 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P57045 and previous config saved to /var/cache/conftool/dbconfig/20240219-125945-arnaudb.json
  • 12:54 marostegui@cumin1002: dbctl commit (dc=all): 'db2138 (re)pooling @ 100%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57044 and previous config saved to /var/cache/conftool/dbconfig/20240219-125456-root.json
  • 12:44 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T357189)', diff saved to https://phabricator.wikimedia.org/P57043 and previous config saved to /var/cache/conftool/dbconfig/20240219-124439-arnaudb.json
  • 12:44 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 12:43 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
  • 12:43 hnowlan: migrating refreshLinks to k8s jobrunners
  • 12:42 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 12:42 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
  • 12:41 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1180 (T357189)', diff saved to https://phabricator.wikimedia.org/P57042 and previous config saved to /var/cache/conftool/dbconfig/20240219-124115-arnaudb.json
  • 12:41 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
  • 12:41 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
  • 12:40 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T357189)', diff saved to https://phabricator.wikimedia.org/P57041 and previous config saved to /var/cache/conftool/dbconfig/20240219-124054-arnaudb.json
  • 12:39 marostegui@cumin1002: dbctl commit (dc=all): 'db2138 (re)pooling @ 75%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57040 and previous config saved to /var/cache/conftool/dbconfig/20240219-123951-root.json
  • 12:37 aborrero@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1032
  • 12:37 aborrero@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1032
  • 12:36 hnowlan@deploy2002: helmfile [codfw] [canary] DONE helmfile.d/services/mw-jobrunner : sync
  • 12:36 hnowlan@deploy2002: helmfile [codfw] [main] DONE helmfile.d/services/mw-jobrunner : sync
  • 12:36 hnowlan@deploy2002: helmfile [codfw] [canary] START helmfile.d/services/mw-jobrunner : sync
  • 12:36 hnowlan@deploy2002: helmfile [codfw] [main] START helmfile.d/services/mw-jobrunner : sync
  • 12:35 aborrero@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1032
  • 12:35 aborrero@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1032
  • 12:35 hnowlan@deploy2002: helmfile [eqiad] [canary] DONE helmfile.d/services/mw-jobrunner : sync
  • 12:35 hnowlan@deploy2002: helmfile [eqiad] [main] DONE helmfile.d/services/mw-jobrunner : sync
  • 12:35 hnowlan@deploy2002: helmfile [eqiad] [main] START helmfile.d/services/mw-jobrunner : sync
  • 12:35 hnowlan@deploy2002: helmfile [eqiad] [canary] START helmfile.d/services/mw-jobrunner : sync
  • 12:32 samtar@deploy2002: Finished scap: Backport for IS/CS: Add wmgEditRecoveryDefaultUserOptions (T350653) (duration: 10m 21s)
  • 12:25 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P57039 and previous config saved to /var/cache/conftool/dbconfig/20240219-122547-arnaudb.json
  • 12:24 marostegui@cumin1002: dbctl commit (dc=all): 'db2138 (re)pooling @ 50%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57038 and previous config saved to /var/cache/conftool/dbconfig/20240219-122446-root.json
  • 12:24 samtar@deploy2002: samtar: Continuing with sync
  • 12:23 samtar@deploy2002: samtar: Backport for IS/CS: Add wmgEditRecoveryDefaultUserOptions (T350653) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 12:21 samtar@deploy2002: Started scap: Backport for IS/CS: Add wmgEditRecoveryDefaultUserOptions (T350653)
  • 12:21 marostegui@cumin1002: dbctl commit (dc=all): 'db2166 (re)pooling @ 100%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57037 and previous config saved to /var/cache/conftool/dbconfig/20240219-122142-root.json
  • 12:19 samtar@deploy2002: backport Cancelled
  • 12:18 samtar@deploy2002: backport Cancelled
  • 12:10 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P57035 and previous config saved to /var/cache/conftool/dbconfig/20240219-121040-arnaudb.json
  • 12:09 marostegui@cumin1002: dbctl commit (dc=all): 'db2137 (re)pooling @ 100%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57034 and previous config saved to /var/cache/conftool/dbconfig/20240219-120951-root.json
  • 12:09 marostegui@cumin1002: dbctl commit (dc=all): 'db2138 (re)pooling @ 25%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57033 and previous config saved to /var/cache/conftool/dbconfig/20240219-120941-root.json
  • 12:06 marostegui@cumin1002: dbctl commit (dc=all): 'db2166 (re)pooling @ 75%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57032 and previous config saved to /var/cache/conftool/dbconfig/20240219-120637-root.json
  • 12:03 aborrero@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1032.eqiad.wmnet with OS bookworm
  • 11:55 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T357189)', diff saved to https://phabricator.wikimedia.org/P57031 and previous config saved to /var/cache/conftool/dbconfig/20240219-115534-arnaudb.json
  • 11:54 marostegui@cumin1002: dbctl commit (dc=all): 'db2137 (re)pooling @ 75%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57030 and previous config saved to /var/cache/conftool/dbconfig/20240219-115439-root.json
  • 11:54 marostegui@cumin1002: dbctl commit (dc=all): 'db2138 (re)pooling @ 10%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57029 and previous config saved to /var/cache/conftool/dbconfig/20240219-115436-root.json
  • 11:54 marostegui@cumin1002: dbctl commit (dc=all): 'db2167 (re)pooling @ 100%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57028 and previous config saved to /var/cache/conftool/dbconfig/20240219-115435-root.json
  • 11:52 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1168 (T357189)', diff saved to https://phabricator.wikimedia.org/P57027 and previous config saved to /var/cache/conftool/dbconfig/20240219-115210-arnaudb.json
  • 11:52 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1168.eqiad.wmnet with reason: Maintenance
  • 11:51 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1168.eqiad.wmnet with reason: Maintenance
  • 11:51 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T357189)', diff saved to https://phabricator.wikimedia.org/P57026 and previous config saved to /var/cache/conftool/dbconfig/20240219-115138-arnaudb.json
  • 11:51 marostegui@cumin1002: dbctl commit (dc=all): 'db2166 (re)pooling @ 50%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57025 and previous config saved to /var/cache/conftool/dbconfig/20240219-115132-root.json
  • 11:39 marostegui@cumin1002: dbctl commit (dc=all): 'db2137 (re)pooling @ 50%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57024 and previous config saved to /var/cache/conftool/dbconfig/20240219-113934-root.json
  • 11:39 marostegui@cumin1002: dbctl commit (dc=all): 'db2138 (re)pooling @ 5%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57023 and previous config saved to /var/cache/conftool/dbconfig/20240219-113931-root.json
  • 11:39 marostegui@cumin1002: dbctl commit (dc=all): 'db2167 (re)pooling @ 75%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57022 and previous config saved to /var/cache/conftool/dbconfig/20240219-113931-root.json
  • 11:39 marostegui@cumin1002: dbctl commit (dc=all): 'Place db2138 in s2 T354826', diff saved to https://phabricator.wikimedia.org/P57021 and previous config saved to /var/cache/conftool/dbconfig/20240219-113926-marostegui.json
  • 11:37 ariel@deploy2002: Finished deploy [dumps/dumps@0d1f9be]: improvements to page content history backfill script (duration: 00m 04s)
  • 11:37 ariel@deploy2002: Started deploy [dumps/dumps@0d1f9be]: improvements to page content history backfill script
  • 11:36 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P57020 and previous config saved to /var/cache/conftool/dbconfig/20240219-113632-arnaudb.json
  • 11:36 marostegui@cumin1002: dbctl commit (dc=all): 'db2166 (re)pooling @ 25%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57019 and previous config saved to /var/cache/conftool/dbconfig/20240219-113627-root.json
  • 11:36 marostegui@cumin1002: dbctl commit (dc=all): 'place db2138 in s2', diff saved to https://phabricator.wikimedia.org/P57018 and previous config saved to /var/cache/conftool/dbconfig/20240219-113622-marostegui.json
  • 11:34 aborrero@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1032.eqiad.wmnet with reason: host reimage
  • 11:28 aborrero@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1032.eqiad.wmnet with reason: host reimage
  • 11:24 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2138 T354826', diff saved to https://phabricator.wikimedia.org/P57017 and previous config saved to /var/cache/conftool/dbconfig/20240219-112405-root.json
  • 11:23 taavi: update cr*-codfw firewall policy for puppetmaster2003 -> puppetserver2003 rename
  • 11:23 marostegui@cumin1002: dbctl commit (dc=all): 'db2137 (re)pooling @ 25%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57016 and previous config saved to /var/cache/conftool/dbconfig/20240219-112311-root.json
  • 11:22 marostegui@cumin1002: dbctl commit (dc=all): 'db2167 (re)pooling @ 50%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57015 and previous config saved to /var/cache/conftool/dbconfig/20240219-112256-root.json
  • 11:20 marostegui@cumin1002: dbctl commit (dc=all): 'db2166 (re)pooling @ 10%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57014 and previous config saved to /var/cache/conftool/dbconfig/20240219-112030-root.json
  • 11:18 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P57013 and previous config saved to /var/cache/conftool/dbconfig/20240219-111819-arnaudb.json
  • 11:11 aborrero@cumin1002: START - Cookbook sre.hosts.reimage for host cloudvirt1032.eqiad.wmnet with OS bookworm
  • 11:10 claime: sudo cumin -b 20 -p 95 '*' 'run-puppet-agent -q --failed-only'
  • 11:09 claime: Running puppet on failed nodes
  • 11:08 marostegui@cumin1002: dbctl commit (dc=all): 'db2137 (re)pooling @ 10%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57012 and previous config saved to /var/cache/conftool/dbconfig/20240219-110806-root.json
  • 11:08 claime: puppetserver roll-restart done
  • 11:07 marostegui@cumin1002: dbctl commit (dc=all): 'db2167 (re)pooling @ 25%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57011 and previous config saved to /var/cache/conftool/dbconfig/20240219-110751-root.json
  • 11:05 marostegui@cumin1002: dbctl commit (dc=all): 'db2166 (re)pooling @ 5%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57010 and previous config saved to /var/cache/conftool/dbconfig/20240219-110525-root.json
  • 11:03 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T357189)', diff saved to https://phabricator.wikimedia.org/P57009 and previous config saved to /var/cache/conftool/dbconfig/20240219-110312-arnaudb.json
  • 11:00 claime: sudo cumin -s 10 -b 1 A:puppetserver 'systemctl restart puppetserver.service'
  • 11:00 claime: roll-restarting puppetserver
  • 10:59 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1165 (T357189)', diff saved to https://phabricator.wikimedia.org/P57008 and previous config saved to /var/cache/conftool/dbconfig/20240219-105949-arnaudb.json
  • 10:59 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 10:59 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 10:59 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance
  • 10:59 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance
  • 10:56 claime: restarting puppetserver on puppetserver1001
  • 10:54 godog: bounce thanos-query on titan1* - T356788
  • 10:53 marostegui@cumin1002: dbctl commit (dc=all): 'db2137 (re)pooling @ 5%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57007 and previous config saved to /var/cache/conftool/dbconfig/20240219-105302-root.json
  • 10:52 marostegui@cumin1002: dbctl commit (dc=all): 'db2167 (re)pooling @ 10%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57006 and previous config saved to /var/cache/conftool/dbconfig/20240219-105246-root.json
  • 10:52 marostegui@cumin1002: dbctl commit (dc=all): 'db2137 into s4, depooled', diff saved to https://phabricator.wikimedia.org/P57005 and previous config saved to /var/cache/conftool/dbconfig/20240219-105211-marostegui.json
  • 10:48 godog: bounce thanos-query on titan2* - T356788
  • 10:45 marostegui@cumin1002: dbctl commit (dc=all): 'Place db2137 in s4 T354826', diff saved to https://phabricator.wikimedia.org/P57004 and previous config saved to /var/cache/conftool/dbconfig/20240219-104556-marostegui.json
  • 10:39 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2137 T354826', diff saved to https://phabricator.wikimedia.org/P57002 and previous config saved to /var/cache/conftool/dbconfig/20240219-103939-root.json
  • 10:37 marostegui@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db2166.codfw.wmnet onto db2167.codfw.wmnet
  • 10:37 marostegui@cumin1002: dbctl commit (dc=all): 'db2167 (re)pooling @ 5%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57001 and previous config saved to /var/cache/conftool/dbconfig/20240219-103741-root.json
  • 10:33 cgoubert@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=thanos-query,name=eqiad
  • 10:33 claime: repooling thanos-query eqiad - T356788
  • 10:26 claime: restarting thanos-query.service - titan1001 - T356788
  • 10:22 claime: restarting thanos-query.service - titan1002 - T356788
  • 10:22 cgoubert@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=thanos-query,name=eqiad
  • 10:22 claime: depooling thanos-query eqiad - T356788
  • 10:11 aborrero@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cloudvirt1032.eqiad.wmnet with reason: reimage
  • 10:11 aborrero@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on cloudvirt1032.eqiad.wmnet with reason: reimage
  • 10:10 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: wmcs::openstack::eqiad1::cloudweb
  • 10:10 claime: restarting thanos-query.service - titan1002 - T356788
  • 10:05 claime: restarting thanos-query.service - titan1001 - T356788
  • 10:04 claime: restarting thanos-query.service - titan1001
  • 10:02 taavi@cumin1002: START - Cookbook sre.puppet.migrate-role for role: wmcs::openstack::eqiad1::cloudweb
  • 09:59 taavi@cumin1002: conftool action : set/pooled=yes; selector: name=cloudweb1004.wikimedia.org
  • 09:55 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudweb1004.wikimedia.org with OS bullseye
  • 09:49 claime: Draining mw2442 - failed RAID - T357380
  • 09:27 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudweb1004.wikimedia.org with reason: host reimage
  • 09:24 taavi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudweb1004.wikimedia.org with reason: host reimage
  • 09:12 taavi@cumin1002: START - Cookbook sre.hosts.reimage for host cloudweb1004.wikimedia.org with OS bullseye
  • 09:10 moritzm: installing gnutls28 security updates on bookworm
  • 09:06 taavi@cumin1002: conftool action : set/pooled=inactive; selector: name=cloudweb1004.wikimedia.org
  • 09:06 marostegui@cumin1002: dbctl commit (dc=all): 'db2168 (re)pooling @ 100%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57000 and previous config saved to /var/cache/conftool/dbconfig/20240219-090600-root.json
  • 09:01 ladsgroup@deploy2002: Finished scap: Backport for Set fawiki to read new in pagelinks (T351237) (duration: 09m 43s)
  • 08:54 ladsgroup@deploy2002: ladsgroup: Continuing with sync
  • 08:53 ladsgroup@deploy2002: ladsgroup: Backport for Set fawiki to read new in pagelinks (T351237) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:51 ladsgroup@deploy2002: Started scap: Backport for Set fawiki to read new in pagelinks (T351237)
  • 08:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2168 (re)pooling @ 75%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56999 and previous config saved to /var/cache/conftool/dbconfig/20240219-085055-root.json
  • 08:38 marostegui@cumin1002: dbctl commit (dc=all): 'db1213 (re)pooling @ 100%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56998 and previous config saved to /var/cache/conftool/dbconfig/20240219-083840-root.json
  • 08:35 marostegui@cumin1002: dbctl commit (dc=all): 'db2168 (re)pooling @ 50%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56997 and previous config saved to /var/cache/conftool/dbconfig/20240219-083550-root.json
  • 08:34 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset-next: apply
  • 08:33 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 08:25 marostegui@cumin1002: START - Cookbook sre.mysql.clone of db2166.codfw.wmnet onto db2167.codfw.wmnet
  • 08:23 marostegui@cumin1002: dbctl commit (dc=all): 'db1213 (re)pooling @ 75%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56996 and previous config saved to /var/cache/conftool/dbconfig/20240219-082336-root.json
  • 08:23 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2166 T354826', diff saved to https://phabricator.wikimedia.org/P56995 and previous config saved to /var/cache/conftool/dbconfig/20240219-082321-root.json
  • 08:23 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 08:22 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 08:22 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2156.codfw.wmnet with reason: Maintenance
  • 08:22 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2156.codfw.wmnet with reason: Maintenance
  • 08:21 marostegui@cumin1002: dbctl commit (dc=all): 'db1246 (re)pooling @ 100%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56994 and previous config saved to /var/cache/conftool/dbconfig/20240219-082121-root.json
  • 08:20 marostegui@cumin1002: dbctl commit (dc=all): 'db2168 (re)pooling @ 25%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56993 and previous config saved to /var/cache/conftool/dbconfig/20240219-082045-root.json
  • 08:19 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2119 (T352010)', diff saved to https://phabricator.wikimedia.org/P56992 and previous config saved to /var/cache/conftool/dbconfig/20240219-081920-ladsgroup.json
  • 08:19 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2119.codfw.wmnet with reason: Maintenance
  • 08:19 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2119.codfw.wmnet with reason: Maintenance
  • 08:16 moritzm: installing runc security updates on buster
  • 08:11 marostegui@cumin1002: dbctl commit (dc=all): 'Place db2167 in s8 T354826', diff saved to https://phabricator.wikimedia.org/P56991 and previous config saved to /var/cache/conftool/dbconfig/20240219-081132-marostegui.json
  • 08:08 marostegui@cumin1002: dbctl commit (dc=all): 'db1213 (re)pooling @ 50%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56990 and previous config saved to /var/cache/conftool/dbconfig/20240219-080831-root.json
  • 08:07 marostegui@cumin1002: dbctl commit (dc=all): 'Remove db2167 multiinstance', diff saved to https://phabricator.wikimedia.org/P56989 and previous config saved to /var/cache/conftool/dbconfig/20240219-080744-marostegui.json
  • 08:06 marostegui@cumin1002: dbctl commit (dc=all): 'db1246 (re)pooling @ 75%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56988 and previous config saved to /var/cache/conftool/dbconfig/20240219-080616-root.json
  • 08:06 marostegui@cumin1002: dbctl commit (dc=all): 'db1244 (re)pooling @ 100%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56987 and previous config saved to /var/cache/conftool/dbconfig/20240219-080612-root.json
  • 08:05 marostegui@cumin1002: dbctl commit (dc=all): 'db2168 (re)pooling @ 10%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56986 and previous config saved to /var/cache/conftool/dbconfig/20240219-080540-root.json
  • 08:03 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2167 T354826', diff saved to https://phabricator.wikimedia.org/P56985 and previous config saved to /var/cache/conftool/dbconfig/20240219-080322-root.json
  • 07:53 marostegui@cumin1002: dbctl commit (dc=all): 'db1213 (re)pooling @ 25%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56984 and previous config saved to /var/cache/conftool/dbconfig/20240219-075325-root.json
  • 07:51 marostegui@cumin1002: dbctl commit (dc=all): 'db1246 (re)pooling @ 50%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56983 and previous config saved to /var/cache/conftool/dbconfig/20240219-075111-root.json
  • 07:51 marostegui@cumin1002: dbctl commit (dc=all): 'db1244 (re)pooling @ 75%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56982 and previous config saved to /var/cache/conftool/dbconfig/20240219-075107-root.json
  • 07:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2168 (re)pooling @ 5%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56981 and previous config saved to /var/cache/conftool/dbconfig/20240219-075035-root.json
  • 07:46 marostegui@cumin1002: dbctl commit (dc=all): 'Place db2168 in s7 T354826', diff saved to https://phabricator.wikimedia.org/P56980 and previous config saved to /var/cache/conftool/dbconfig/20240219-074609-marostegui.json
  • 07:44 marostegui@cumin1002: dbctl commit (dc=all): 'Remove db2168 multiinstance', diff saved to https://phabricator.wikimedia.org/P56979 and previous config saved to /var/cache/conftool/dbconfig/20240219-074450-marostegui.json
  • 07:41 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2168 T354826', diff saved to https://phabricator.wikimedia.org/P56978 and previous config saved to /var/cache/conftool/dbconfig/20240219-074148-root.json
  • 07:38 marostegui@cumin1002: dbctl commit (dc=all): 'db1213 (re)pooling @ 10%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56977 and previous config saved to /var/cache/conftool/dbconfig/20240219-073820-root.json
  • 07:36 marostegui@cumin1002: dbctl commit (dc=all): 'db1246 (re)pooling @ 25%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56976 and previous config saved to /var/cache/conftool/dbconfig/20240219-073606-root.json
  • 07:36 marostegui@cumin1002: dbctl commit (dc=all): 'db1244 (re)pooling @ 50%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56975 and previous config saved to /var/cache/conftool/dbconfig/20240219-073602-root.json
  • 07:35 marostegui@cumin1002: dbctl commit (dc=all): 'db1170 (re)pooling @ 100%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56974 and previous config saved to /var/cache/conftool/dbconfig/20240219-073521-root.json
  • 07:23 marostegui@cumin1002: dbctl commit (dc=all): 'db1213 (re)pooling @ 5%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56973 and previous config saved to /var/cache/conftool/dbconfig/20240219-072315-root.json
  • 07:21 marostegui@cumin1002: dbctl commit (dc=all): 'db1246 (re)pooling @ 10%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56972 and previous config saved to /var/cache/conftool/dbconfig/20240219-072101-root.json
  • 07:20 marostegui@cumin1002: dbctl commit (dc=all): 'db1244 (re)pooling @ 25%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56971 and previous config saved to /var/cache/conftool/dbconfig/20240219-072057-root.json
  • 07:20 marostegui@cumin1002: dbctl commit (dc=all): 'db1170 (re)pooling @ 75%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56970 and previous config saved to /var/cache/conftool/dbconfig/20240219-072016-root.json
  • 07:17 marostegui@cumin1002: dbctl commit (dc=all): 'Place db1213 in s5 T354826', diff saved to https://phabricator.wikimedia.org/P56969 and previous config saved to /var/cache/conftool/dbconfig/20240219-071658-marostegui.json
  • 07:16 marostegui@cumin1002: dbctl commit (dc=all): 'Remove db1213 multiinstance', diff saved to https://phabricator.wikimedia.org/P56968 and previous config saved to /var/cache/conftool/dbconfig/20240219-071604-marostegui.json
  • 07:08 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1213 T354826', diff saved to https://phabricator.wikimedia.org/P56967 and previous config saved to /var/cache/conftool/dbconfig/20240219-070815-root.json
  • 07:06 marostegui@cumin1002: dbctl commit (dc=all): 'db1246 (re)pooling @ 5%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56966 and previous config saved to /var/cache/conftool/dbconfig/20240219-070556-root.json
  • 07:05 marostegui@cumin1002: dbctl commit (dc=all): 'db1244 (re)pooling @ 10%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56965 and previous config saved to /var/cache/conftool/dbconfig/20240219-070552-root.json
  • 07:05 marostegui@cumin1002: dbctl commit (dc=all): 'db1170 (re)pooling @ 50%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56964 and previous config saved to /var/cache/conftool/dbconfig/20240219-070511-root.json
  • 07:02 marostegui@cumin1002: dbctl commit (dc=all): 'Place db1246 in s2 T354826', diff saved to https://phabricator.wikimedia.org/P56963 and previous config saved to /var/cache/conftool/dbconfig/20240219-070212-marostegui.json
  • 06:58 marostegui@cumin1002: dbctl commit (dc=all): 'Remove db1246 multiinstance', diff saved to https://phabricator.wikimedia.org/P56962 and previous config saved to /var/cache/conftool/dbconfig/20240219-065848-marostegui.json
  • 06:54 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1246 T354826', diff saved to https://phabricator.wikimedia.org/P56961 and previous config saved to /var/cache/conftool/dbconfig/20240219-065456-root.json
  • 06:50 marostegui@cumin1002: dbctl commit (dc=all): 'db1244 (re)pooling @ 5%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56960 and previous config saved to /var/cache/conftool/dbconfig/20240219-065048-root.json
  • 06:50 marostegui@cumin1002: dbctl commit (dc=all): 'db1170 (re)pooling @ 25%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56959 and previous config saved to /var/cache/conftool/dbconfig/20240219-065007-root.json
  • 06:43 marostegui@cumin1002: dbctl commit (dc=all): 'Place db1244 in s4 T354826', diff saved to https://phabricator.wikimedia.org/P56958 and previous config saved to /var/cache/conftool/dbconfig/20240219-064350-marostegui.json
  • 06:41 marostegui@cumin1002: dbctl commit (dc=all): 'Place db1244 in s4 T354826', diff saved to https://phabricator.wikimedia.org/P56957 and previous config saved to /var/cache/conftool/dbconfig/20240219-064157-marostegui.json
  • 06:35 marostegui@cumin1002: dbctl commit (dc=all): 'db1170 (re)pooling @ 10%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56956 and previous config saved to /var/cache/conftool/dbconfig/20240219-063502-root.json
  • 06:34 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1244 T354826', diff saved to https://phabricator.wikimedia.org/P56955 and previous config saved to /var/cache/conftool/dbconfig/20240219-063457-root.json
  • 06:19 marostegui@cumin1002: dbctl commit (dc=all): 'db1170 (re)pooling @ 5%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56954 and previous config saved to /var/cache/conftool/dbconfig/20240219-061957-root.json
  • 06:19 marostegui@cumin1002: dbctl commit (dc=all): 'Place db1170 in s7 T354826', diff saved to https://phabricator.wikimedia.org/P56953 and previous config saved to /var/cache/conftool/dbconfig/20240219-061919-marostegui.json
  • 06:17 marostegui@deploy2002: Finished scap: Backport for Revert "ProductionServices.php: Promote pc2014 to pc1 master" (duration: 19m 02s)
  • 06:15 marostegui@cumin1002: dbctl commit (dc=all): 'Place db1170 in s7 T354826', diff saved to https://phabricator.wikimedia.org/P56952 and previous config saved to /var/cache/conftool/dbconfig/20240219-061548-marostegui.json
  • 06:11 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1170 T354826', diff saved to https://phabricator.wikimedia.org/P56951 and previous config saved to /var/cache/conftool/dbconfig/20240219-061121-root.json
  • 06:08 marostegui@deploy2002: marostegui: Continuing with sync
  • 06:08 marostegui@deploy2002: marostegui: Backport for Revert "ProductionServices.php: Promote pc2014 to pc1 master" synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 05:58 marostegui@deploy2002: Started scap: Backport for Revert "ProductionServices.php: Promote pc2014 to pc1 master"
  • 05:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on pc[2011,2014].codfw.wmnet,pc[1011,1014].eqiad.wmnet with reason: Primary switchover pc1 T356371
  • 05:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on pc[2011,2014].codfw.wmnet,pc[1011,1014].eqiad.wmnet with reason: Primary switchover pc1 T356371

2024-02-18

  • 23:11 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T352010)', diff saved to https://phabricator.wikimedia.org/P56950 and previous config saved to /var/cache/conftool/dbconfig/20240218-231102-ladsgroup.json
  • 22:55 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P56949 and previous config saved to /var/cache/conftool/dbconfig/20240218-225556-ladsgroup.json
  • 22:40 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P56948 and previous config saved to /var/cache/conftool/dbconfig/20240218-224049-ladsgroup.json
  • 22:25 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T352010)', diff saved to https://phabricator.wikimedia.org/P56947 and previous config saved to /var/cache/conftool/dbconfig/20240218-222543-ladsgroup.json
  • 21:10 eileen: civicrm upgraded from 45a0138c to 5af300d4
  • 17:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2172 (T352010)', diff saved to https://phabricator.wikimedia.org/P56945 and previous config saved to /var/cache/conftool/dbconfig/20240218-171526-ladsgroup.json
  • 17:15 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2172.codfw.wmnet with reason: Maintenance
  • 17:15 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2172.codfw.wmnet with reason: Maintenance
  • 17:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T352010)', diff saved to https://phabricator.wikimedia.org/P56944 and previous config saved to /var/cache/conftool/dbconfig/20240218-171502-ladsgroup.json
  • 16:59 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P56943 and previous config saved to /var/cache/conftool/dbconfig/20240218-165955-ladsgroup.json
  • 16:44 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P56942 and previous config saved to /var/cache/conftool/dbconfig/20240218-164448-ladsgroup.json
  • 16:29 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T352010)', diff saved to https://phabricator.wikimedia.org/P56941 and previous config saved to /var/cache/conftool/dbconfig/20240218-162942-ladsgroup.json
  • 11:19 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2155 (T352010)', diff saved to https://phabricator.wikimedia.org/P56940 and previous config saved to /var/cache/conftool/dbconfig/20240218-111954-ladsgroup.json
  • 11:19 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 11:19 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 11:19 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
  • 11:19 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
  • 11:19 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T352010)', diff saved to https://phabricator.wikimedia.org/P56939 and previous config saved to /var/cache/conftool/dbconfig/20240218-111915-ladsgroup.json
  • 11:04 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P56938 and previous config saved to /var/cache/conftool/dbconfig/20240218-110408-ladsgroup.json
  • 10:49 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P56937 and previous config saved to /var/cache/conftool/dbconfig/20240218-104901-ladsgroup.json
  • 10:33 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T352010)', diff saved to https://phabricator.wikimedia.org/P56936 and previous config saved to /var/cache/conftool/dbconfig/20240218-103355-ladsgroup.json
  • 09:33 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2147 (T352010)', diff saved to https://phabricator.wikimedia.org/P56935 and previous config saved to /var/cache/conftool/dbconfig/20240218-093323-ladsgroup.json
  • 09:33 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2147.codfw.wmnet with reason: Maintenance
  • 09:33 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2147.codfw.wmnet with reason: Maintenance
  • 09:33 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2140 (T352010)', diff saved to https://phabricator.wikimedia.org/P56934 and previous config saved to /var/cache/conftool/dbconfig/20240218-093301-ladsgroup.json
  • 09:17 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2140', diff saved to https://phabricator.wikimedia.org/P56933 and previous config saved to /var/cache/conftool/dbconfig/20240218-091754-ladsgroup.json
  • 09:02 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2140', diff saved to https://phabricator.wikimedia.org/P56932 and previous config saved to /var/cache/conftool/dbconfig/20240218-090248-ladsgroup.json
  • 08:47 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2140 (T352010)', diff saved to https://phabricator.wikimedia.org/P56931 and previous config saved to /var/cache/conftool/dbconfig/20240218-084741-ladsgroup.json
  • 03:55 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2140 (T352010)', diff saved to https://phabricator.wikimedia.org/P56930 and previous config saved to /var/cache/conftool/dbconfig/20240218-035542-ladsgroup.json
  • 03:55 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2140.codfw.wmnet with reason: Maintenance
  • 03:55 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2140.codfw.wmnet with reason: Maintenance

2024-02-17

  • 23:42 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 23:42 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 23:42 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P56929 and previous config saved to /var/cache/conftool/dbconfig/20240217-234216-ladsgroup.json
  • 23:27 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314', diff saved to https://phabricator.wikimedia.org/P56928 and previous config saved to /var/cache/conftool/dbconfig/20240217-232709-ladsgroup.json
  • 23:12 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314', diff saved to https://phabricator.wikimedia.org/P56927 and previous config saved to /var/cache/conftool/dbconfig/20240217-231203-ladsgroup.json
  • 22:56 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P56926 and previous config saved to /var/cache/conftool/dbconfig/20240217-225656-ladsgroup.json
  • 17:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2138:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P56925 and previous config saved to /var/cache/conftool/dbconfig/20240217-175100-ladsgroup.json
  • 17:50 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2138.codfw.wmnet with reason: Maintenance
  • 17:50 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2138.codfw.wmnet with reason: Maintenance
  • 17:50 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P56924 and previous config saved to /var/cache/conftool/dbconfig/20240217-175038-ladsgroup.json
  • 17:35 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314', diff saved to https://phabricator.wikimedia.org/P56923 and previous config saved to /var/cache/conftool/dbconfig/20240217-173531-ladsgroup.json
  • 17:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314', diff saved to https://phabricator.wikimedia.org/P56922 and previous config saved to /var/cache/conftool/dbconfig/20240217-172024-ladsgroup.json
  • 17:05 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P56921 and previous config saved to /var/cache/conftool/dbconfig/20240217-170518-ladsgroup.json
  • 11:54 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2137:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P56920 and previous config saved to /var/cache/conftool/dbconfig/20240217-115446-ladsgroup.json
  • 11:54 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2137.codfw.wmnet with reason: Maintenance
  • 11:54 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2137.codfw.wmnet with reason: Maintenance
  • 11:54 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2136 (T352010)', diff saved to https://phabricator.wikimedia.org/P56919 and previous config saved to /var/cache/conftool/dbconfig/20240217-115422-ladsgroup.json
  • 11:39 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2136', diff saved to https://phabricator.wikimedia.org/P56918 and previous config saved to /var/cache/conftool/dbconfig/20240217-113916-ladsgroup.json
  • 11:24 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2136', diff saved to https://phabricator.wikimedia.org/P56917 and previous config saved to /var/cache/conftool/dbconfig/20240217-112409-ladsgroup.json
  • 11:09 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2136 (T352010)', diff saved to https://phabricator.wikimedia.org/P56916 and previous config saved to /var/cache/conftool/dbconfig/20240217-110903-ladsgroup.json
  • 10:08 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2136 (T352010)', diff saved to https://phabricator.wikimedia.org/P56915 and previous config saved to /var/cache/conftool/dbconfig/20240217-100830-ladsgroup.json
  • 10:08 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2136.codfw.wmnet with reason: Maintenance
  • 10:08 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2136.codfw.wmnet with reason: Maintenance
  • 10:08 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2119 (T352010)', diff saved to https://phabricator.wikimedia.org/P56914 and previous config saved to /var/cache/conftool/dbconfig/20240217-100809-ladsgroup.json
  • 09:53 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2119', diff saved to https://phabricator.wikimedia.org/P56913 and previous config saved to /var/cache/conftool/dbconfig/20240217-095302-ladsgroup.json
  • 09:37 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2119', diff saved to https://phabricator.wikimedia.org/P56912 and previous config saved to /var/cache/conftool/dbconfig/20240217-093755-ladsgroup.json
  • 09:22 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2119 (T352010)', diff saved to https://phabricator.wikimedia.org/P56911 and previous config saved to /var/cache/conftool/dbconfig/20240217-092249-ladsgroup.json
  • 08:22 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2119 (T352010)', diff saved to https://phabricator.wikimedia.org/P56910 and previous config saved to /var/cache/conftool/dbconfig/20240217-082217-ladsgroup.json
  • 08:22 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2119.codfw.wmnet with reason: Maintenance
  • 08:22 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2119.codfw.wmnet with reason: Maintenance
  • 08:21 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2110 (T352010)', diff saved to https://phabricator.wikimedia.org/P56909 and previous config saved to /var/cache/conftool/dbconfig/20240217-082155-ladsgroup.json
  • 08:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2110', diff saved to https://phabricator.wikimedia.org/P56908 and previous config saved to /var/cache/conftool/dbconfig/20240217-080649-ladsgroup.json
  • 07:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2110', diff saved to https://phabricator.wikimedia.org/P56907 and previous config saved to /var/cache/conftool/dbconfig/20240217-075142-ladsgroup.json
  • 07:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2110 (T352010)', diff saved to https://phabricator.wikimedia.org/P56906 and previous config saved to /var/cache/conftool/dbconfig/20240217-073636-ladsgroup.json
  • 02:22 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2110 (T352010)', diff saved to https://phabricator.wikimedia.org/P56905 and previous config saved to /var/cache/conftool/dbconfig/20240217-022159-ladsgroup.json
  • 02:21 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2110.codfw.wmnet with reason: Maintenance
  • 02:21 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2110.codfw.wmnet with reason: Maintenance
  • 02:21 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2106 (T352010)', diff saved to https://phabricator.wikimedia.org/P56904 and previous config saved to /var/cache/conftool/dbconfig/20240217-022137-ladsgroup.json
  • 02:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2106', diff saved to https://phabricator.wikimedia.org/P56903 and previous config saved to /var/cache/conftool/dbconfig/20240217-020630-ladsgroup.json
  • 01:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2106', diff saved to https://phabricator.wikimedia.org/P56902 and previous config saved to /var/cache/conftool/dbconfig/20240217-015123-ladsgroup.json
  • 01:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2106 (T352010)', diff saved to https://phabricator.wikimedia.org/P56901 and previous config saved to /var/cache/conftool/dbconfig/20240217-013617-ladsgroup.json

2024-02-16

  • 21:46 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2205.mgmt.codfw.wmnet with reboot policy FORCED
  • 21:45 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2205.mgmt.codfw.wmnet with reboot policy FORCED
  • 21:43 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2204.mgmt.codfw.wmnet with reboot policy FORCED
  • 21:43 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2204.mgmt.codfw.wmnet with reboot policy FORCED
  • 21:42 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 21:41 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 21:41 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2204.mgmt.codfw.wmnet with reboot policy FORCED
  • 21:41 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2205.mgmt.codfw.wmnet with reboot policy FORCED
  • 21:40 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2203.mgmt.codfw.wmnet with reboot policy FORCED
  • 21:40 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2205.mgmt.codfw.wmnet with reboot policy FORCED
  • 21:40 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2204.mgmt.codfw.wmnet with reboot policy FORCED
  • 21:40 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2203.mgmt.codfw.wmnet with reboot policy FORCED
  • 21:39 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 21:38 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2203 to codfw - jhancock@cumin2002"
  • 21:38 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2203 to codfw - jhancock@cumin2002"
  • 21:35 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 21:32 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2201.mgmt.codfw.wmnet with reboot policy FORCED
  • 21:22 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2200.mgmt.codfw.wmnet with reboot policy FORCED
  • 21:21 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2202.mgmt.codfw.wmnet with reboot policy FORCED
  • 21:03 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2202.mgmt.codfw.wmnet with reboot policy FORCED
  • 21:03 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2201.mgmt.codfw.wmnet with reboot policy FORCED
  • 21:03 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2200.mgmt.codfw.wmnet with reboot policy FORCED
  • 20:58 tzatziki: removing 2 files for legal compliance
  • 20:47 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2156 (T352010)', diff saved to https://phabricator.wikimedia.org/P56900 and previous config saved to /var/cache/conftool/dbconfig/20240216-204746-ladsgroup.json
  • 20:47 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 20:47 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 20:47 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2156.codfw.wmnet with reason: Maintenance
  • 20:47 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2156.codfw.wmnet with reason: Maintenance
  • 20:47 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T352010)', diff saved to https://phabricator.wikimedia.org/P56899 and previous config saved to /var/cache/conftool/dbconfig/20240216-204709-ladsgroup.json
  • 20:38 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:38 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:35 hashar@deploy2002: Finished deploy [integration/docroot@7a9d46f]: build: Upgrade mediawiki/mediawiki-codesniffer to v43.0.0 (duration: 00m 07s)
  • 20:35 hashar@deploy2002: Started deploy [integration/docroot@7a9d46f]: build: Upgrade mediawiki/mediawiki-codesniffer to v43.0.0
  • 20:33 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:33 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:32 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P56898 and previous config saved to /var/cache/conftool/dbconfig/20240216-203202-ladsgroup.json
  • 20:28 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2200.mgmt.codfw.wmnet with reboot policy FORCED
  • 20:28 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2200.mgmt.codfw.wmnet with reboot policy FORCED
  • 20:23 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 20:22 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 20:17 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2200.mgmt.codfw.wmnet with reboot policy FORCED
  • 20:17 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2200.mgmt.codfw.wmnet with reboot policy FORCED
  • 20:16 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P56897 and previous config saved to /var/cache/conftool/dbconfig/20240216-201656-ladsgroup.json
  • 20:16 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2202.mgmt.codfw.wmnet with reboot policy FORCED
  • 20:16 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 20:15 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 20:12 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2200.mgmt.codfw.wmnet with reboot policy FORCED
  • 20:12 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2200.mgmt.codfw.wmnet with reboot policy FORCED
  • 20:12 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2106 (T352010)', diff saved to https://phabricator.wikimedia.org/P56896 and previous config saved to /var/cache/conftool/dbconfig/20240216-201239-ladsgroup.json
  • 20:12 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2106.codfw.wmnet with reason: Maintenance
  • 20:12 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2106.codfw.wmnet with reason: Maintenance
  • 20:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T352010)', diff saved to https://phabricator.wikimedia.org/P56895 and previous config saved to /var/cache/conftool/dbconfig/20240216-200149-ladsgroup.json
  • 19:56 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:55 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:52 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2201.mgmt.codfw.wmnet with reboot policy FORCED
  • 19:52 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2201.mgmt.codfw.wmnet with reboot policy FORCED
  • 19:51 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2200.mgmt.codfw.wmnet with reboot policy FORCED
  • 19:51 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2200.mgmt.codfw.wmnet with reboot policy FORCED
  • 19:50 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:49 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 19:46 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2200.mgmt.codfw.wmnet with reboot policy FORCED
  • 19:46 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2200.mgmt.codfw.wmnet with reboot policy FORCED
  • 19:45 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2200.mgmt.codfw.wmnet with reboot policy FORCED
  • 19:45 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2201.mgmt.codfw.wmnet with reboot policy FORCED
  • 19:45 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2202.mgmt.codfw.wmnet with reboot policy FORCED
  • 19:45 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2201.mgmt.codfw.wmnet with reboot policy FORCED
  • 19:45 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2200.mgmt.codfw.wmnet with reboot policy FORCED
  • 19:44 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:44 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2200 to codfw - jhancock@cumin2002"
  • 19:43 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2200 to codfw - jhancock@cumin2002"
  • 19:41 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 19:08 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:07 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:50 jdrewniak@deploy2002: Finished scap: Backport for dd elements should have no margin (T357742) (duration: 14m 04s)
  • 17:43 jdrewniak@deploy2002: jdrewniak and kemayo: Continuing with sync
  • 17:37 jdrewniak@deploy2002: jdrewniak and kemayo: Backport for dd elements should have no margin (T357742) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 17:36 jdrewniak@deploy2002: Started scap: Backport for dd elements should have no margin (T357742)
  • 17:09 jdrewniak@deploy2002: Finished scap: Backport for Mitigates font size issues (T357724) (duration: 10m 04s)
  • 17:02 jdrewniak@deploy2002: jdrewniak and jdlrobson: Continuing with sync
  • 17:02 jdrewniak@deploy2002: jdrewniak and jdlrobson: Backport for Mitigates font size issues (T357724) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 16:59 jdrewniak@deploy2002: Started scap: Backport for Mitigates font size issues (T357724)
  • 16:53 bking@deploy2002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 16:53 bking@deploy2002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 16:39 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2199.mgmt.codfw.wmnet with reboot policy FORCED
  • 16:39 mvernon@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ms-be[1044-1050].eqiad.wmnet
  • 16:39 mvernon@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:39 mvernon@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ms-be[1044-1050].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - mvernon@cumin1002"
  • 16:36 mvernon@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ms-be[1044-1050].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - mvernon@cumin1002"
  • 16:34 mvernon@cumin1002: START - Cookbook sre.dns.netbox
  • 16:18 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2199.mgmt.codfw.wmnet with reboot policy FORCED
  • 16:18 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:18 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2199 to codfw - jhancock@cumin2002"
  • 16:17 bking@deploy2002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 16:17 bking@deploy2002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 16:16 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2199 to codfw - jhancock@cumin2002"
  • 16:12 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 16:05 mvernon@cumin1002: START - Cookbook sre.hosts.decommission for hosts ms-be[1044-1050].eqiad.wmnet
  • 16:04 ejegg: fundraising civicrm upgraded from 84ba0ccf to 45a0138c
  • 16:01 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be1047
  • 15:53 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Banning hosts: cloudelastic100[1-4]* for decom hosts - bking@cumin2002 - T357780
  • 15:53 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Banning hosts: cloudelastic100[1-4]* for decom hosts - bking@cumin2002 - T357780
  • 15:36 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1047.eqiad.wmnet
  • 15:35 hnowlan@cumin1002: conftool action : set/weight=10:pooled=yes; selector: name=(mw1349.eqiad.wmnet|mw1367.eqiad.wmnet|mw1476.eqiad.wmnet|mw1477.eqiad.wmnet),cluster=kubernetes,service=kubesvc
  • 15:30 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1047.eqiad.wmnet
  • 15:30 mvernon@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ms-be1047.eqiad.wmnet
  • 15:28 hnowlan: running `homer 'cr*eqiad*' commit 'T351074'` for new k8s workers
  • 15:20 mvernon@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ms-be1047.eqiad.wmnet
  • 15:20 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be1047
  • 15:14 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be1046
  • 15:12 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2099.codfw.wmnet with reason: Maintenance
  • 15:12 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2099.codfw.wmnet with reason: Maintenance
  • 14:49 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be1046
  • 14:47 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1046.eqiad.wmnet
  • 14:42 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1046.eqiad.wmnet
  • 14:34 Lucas_WMDE: START lucaswerkmeister-wmde@mwmaint2002:~$ mwscript extensions/DiscussionTools/maintenance/persistRevisionThreadItems.php --wiki enwiki --current --all --start '["73436010"]' | tee -a ~/T315510-enwiki
  • 14:33 Lucas_WMDE: STOP persistRevisionThreadItems.php on enwiki for T315510 again, I forgot to adjust the --start >.<
  • 14:33 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be1046
  • 14:32 Lucas_WMDE: START lucaswerkmeister-wmde@mwmaint2002:~$ mwscript extensions/DiscussionTools/maintenance/persistRevisionThreadItems.php --wiki enwiki --current --all --start '["67578461"]' | tee -a ~/T315510-enwiki
  • 14:32 Lucas_WMDE: STOP persistRevisionThreadItems on enwiki for T315510 – for restart on wmf.18; last output: --start '["73436010"]'
  • 14:19 bking@cumin2002: conftool action : set/pooled=yes; selector: name=cloudelastic1005.eqiad.wmnet
  • 14:19 bking@cumin2002: conftool action : set/weight=10; selector: name=cloudelastic1005.eqiad.wmnet
  • 14:18 mvernon@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ms-be1046.eqiad.wmnet
  • 14:08 mvernon@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ms-be1046.eqiad.wmnet
  • 14:08 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 14:07 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be1046
  • 14:07 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 14:07 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 14:06 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 14:06 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 14:06 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 13:47 hashar@deploy2002: Finished scap: Backport for Revert "Avoid creating a MWReferenceModel if not needed" (T357745) (duration: 13m 24s)
  • 13:39 hashar@deploy2002: matmarex and hashar: Continuing with sync
  • 13:37 hashar@deploy2002: matmarex and hashar: Backport for Revert "Avoid creating a MWReferenceModel if not needed" (T357745) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:37 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1477.eqiad.wmnet with OS bullseye
  • 13:34 hashar@deploy2002: Started scap: Backport for Revert "Avoid creating a MWReferenceModel if not needed" (T357745)
  • 13:26 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1349.eqiad.wmnet with OS bullseye
  • 13:23 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1367.eqiad.wmnet with OS bullseye
  • 13:20 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1476.eqiad.wmnet with OS bullseye
  • 13:20 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1477.eqiad.wmnet with reason: host reimage
  • 13:17 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1477.eqiad.wmnet with reason: host reimage
  • 13:07 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1349.eqiad.wmnet with reason: host reimage
  • 13:05 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1477.eqiad.wmnet with OS bullseye
  • 13:04 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1367.eqiad.wmnet with reason: host reimage
  • 13:02 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1476.eqiad.wmnet with reason: host reimage
  • 13:00 hnowlan@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw1477.eqiad.wmnet with OS bullseye
  • 13:00 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1349.eqiad.wmnet with reason: host reimage
  • 13:00 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1367.eqiad.wmnet with reason: host reimage
  • 12:59 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1476.eqiad.wmnet with reason: host reimage
  • 12:47 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1477.eqiad.wmnet with OS bullseye
  • 12:46 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1476.eqiad.wmnet with OS bullseye
  • 12:46 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1367.eqiad.wmnet with OS bullseye
  • 12:46 taavi: publish docker-registry.discovery.wmnet/python3-bookworm:0.0.1
  • 12:46 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1349.eqiad.wmnet with OS bullseye
  • 12:14 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2149 (T352010)', diff saved to https://phabricator.wikimedia.org/P56892 and previous config saved to /var/cache/conftool/dbconfig/20240216-121416-ladsgroup.json
  • 12:14 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2149.codfw.wmnet with reason: Maintenance
  • 12:13 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2149.codfw.wmnet with reason: Maintenance
  • 10:58 moritzm: update bullseye/bookworm netboot images on the Puppet 7 volatile environment to the latest point releases (to bring in sync with volatile for Puppet 5) T341056
  • 10:50 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 10:50 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 10:50 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1249 (T352010)', diff saved to https://phabricator.wikimedia.org/P56891 and previous config saved to /var/cache/conftool/dbconfig/20240216-105041-ladsgroup.json
  • 10:44 volans@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts sretest1001.eqiad.wmnet
  • 10:44 volans@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1001.eqiad.wmnet
  • 10:43 volans@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts sretest1001.eqiad.wmnet
  • 10:42 volans@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1001.eqiad.wmnet
  • 10:41 hnowlan@cumin2002: conftool action : set/pooled=yes; selector: name=mw2379.codfw.wmnet
  • 10:35 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P56890 and previous config saved to /var/cache/conftool/dbconfig/20240216-103535-ladsgroup.json
  • 10:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P56889 and previous config saved to /var/cache/conftool/dbconfig/20240216-102028-ladsgroup.json
  • 10:05 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1249 (T352010)', diff saved to https://phabricator.wikimedia.org/P56888 and previous config saved to /var/cache/conftool/dbconfig/20240216-100521-ladsgroup.json
  • 10:03 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2194.codfw.wmnet with reason: Silence for WE
  • 10:03 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on db2194.codfw.wmnet with reason: Silence for WE
  • 09:07 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host restbase1036.eqiad.wmnet with OS bullseye
  • 09:07 jclark@cumin1002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 09:06 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 08:38 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host an-redacteddb1001.eqiad.wmnet with OS bullseye
  • 08:07 jclark@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['an-redacteddb1001']
  • 08:07 jclark@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['an-redacteddb1001']
  • 06:04 apergos: manually generating 7z files in parallel for wikidata full history dumps run, in screen session, owned by ariel, on snapshot1009
  • 05:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1249 (T352010)', diff saved to https://phabricator.wikimedia.org/P56887 and previous config saved to /var/cache/conftool/dbconfig/20240216-052044-ladsgroup.json
  • 05:20 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1249.eqiad.wmnet with reason: Maintenance
  • 05:20 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1249.eqiad.wmnet with reason: Maintenance
  • 05:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1248 (T352010)', diff saved to https://phabricator.wikimedia.org/P56886 and previous config saved to /var/cache/conftool/dbconfig/20240216-052021-ladsgroup.json
  • 05:05 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 05:05 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P56885 and previous config saved to /var/cache/conftool/dbconfig/20240216-050514-ladsgroup.json
  • 05:05 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 05:05 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2109 (T352010)', diff saved to https://phabricator.wikimedia.org/P56884 and previous config saved to /var/cache/conftool/dbconfig/20240216-050458-ladsgroup.json
  • 04:50 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P56883 and previous config saved to /var/cache/conftool/dbconfig/20240216-045008-ladsgroup.json
  • 04:49 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P56882 and previous config saved to /var/cache/conftool/dbconfig/20240216-044952-ladsgroup.json
  • 04:35 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1248 (T352010)', diff saved to https://phabricator.wikimedia.org/P56881 and previous config saved to /var/cache/conftool/dbconfig/20240216-043501-ladsgroup.json
  • 04:34 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P56880 and previous config saved to /var/cache/conftool/dbconfig/20240216-043445-ladsgroup.json
  • 04:19 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2109 (T352010)', diff saved to https://phabricator.wikimedia.org/P56879 and previous config saved to /var/cache/conftool/dbconfig/20240216-041938-ladsgroup.json
  • 01:26 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:ml-cache: Restart to pickup logging jars — T353550 - eevans@cumin1002
  • 01:08 htriedman@deploy2002: Finished deploy [airflow-dags/platform_eng@d93828e]: (no justification provided) (duration: 00m 28s)
  • 01:07 htriedman@deploy2002: Started deploy [airflow-dags/platform_eng@d93828e]: (no justification provided)
  • 00:49 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching A:ml-cache: Restart to pickup logging jars — T353550 - eevans@cumin1002
  • 00:28 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:cassandra-dev: Restart to pickup logging jars — T353550 - eevans@cumin1002
  • 00:27 ryankemper@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.UPGRADE (3 nodes at a time) for ElasticSearch cluster search_eqiad: eqiad plugin upgrade - ryankemper@cumin2002 - T356651
  • 00:16 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1248 (T352010)', diff saved to https://phabricator.wikimedia.org/P56877 and previous config saved to /var/cache/conftool/dbconfig/20240216-001636-ladsgroup.json
  • 00:16 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1248.eqiad.wmnet with reason: Maintenance
  • 00:16 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1248.eqiad.wmnet with reason: Maintenance
  • 00:16 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1247 (T352010)', diff saved to https://phabricator.wikimedia.org/P56876 and previous config saved to /var/cache/conftool/dbconfig/20240216-001612-ladsgroup.json
  • 00:06 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching A:cassandra-dev: Restart to pickup logging jars — T353550 - eevans@cumin1002
  • 00:02 thcipriani@deploy2002: Finished scap: Backport for Connection: Correct read-only detection (T354793 T356526) (duration: 10m 28s)
  • 00:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P56875 and previous config saved to /var/cache/conftool/dbconfig/20240216-000106-ladsgroup.json

2024-02-15

  • 23:55 thcipriani@deploy2002: ebernhardson and thcipriani: Continuing with sync
  • 23:53 thcipriani@deploy2002: ebernhardson and thcipriani: Backport for Connection: Correct read-only detection (T354793 T356526) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 23:52 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1005.eqiad.wmnet with OS bullseye
  • 23:52 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - bking@cumin2002"
  • 23:52 thcipriani@deploy2002: Started scap: Backport for Connection: Correct read-only detection (T354793 T356526)
  • 23:50 thcipriani@deploy2002: Finished scap: Backport for Add border-collapse to wikitable (T357589) (duration: 11m 31s)
  • 23:46 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - bking@cumin2002"
  • 23:46 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P56874 and previous config saved to /var/cache/conftool/dbconfig/20240215-234600-ladsgroup.json
  • 23:42 thcipriani@deploy2002: thcipriani and jdlrobson: Continuing with sync
  • 23:40 thcipriani@deploy2002: thcipriani and jdlrobson: Backport for Add border-collapse to wikitable (T357589) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 23:38 thcipriani@deploy2002: Started scap: Backport for Add border-collapse to wikitable (T357589)
  • 23:33 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:33 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:32 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:32 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:31 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1005.eqiad.wmnet with reason: host reimage
  • 23:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1247 (T352010)', diff saved to https://phabricator.wikimedia.org/P56873 and previous config saved to /var/cache/conftool/dbconfig/20240215-233053-ladsgroup.json
  • 23:28 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1005.eqiad.wmnet with reason: host reimage
  • 23:26 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:sessionstore: Restart to pickup logging jars — T353550 - eevans@cumin1002
  • 23:18 tzatziki: removing 2 files for legal compliance
  • 23:13 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1005.eqiad.wmnet with OS bullseye
  • 23:09 bking@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudelastic1005.eqiad.wmnet with OS bullseye
  • 23:02 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1005.eqiad.wmnet with OS bullseye
  • 23:02 bking@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudelastic1005.eqiad.wmnet with OS bullseye
  • 22:47 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching A:sessionstore: Restart to pickup logging jars — T353550 - eevans@cumin1002
  • 22:40 vriley@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['an-redacteddb1001']
  • 22:40 vriley@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['an-redacteddb1001']
  • 22:40 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1005.eqiad.wmnet with OS bullseye
  • 22:38 vriley@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host an-redacteddb1001.mgmt.eqiad.wmnet with reboot policy FORCED
  • 22:34 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1005
  • 22:34 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching P{P:cassandra%rack = "c_f"} and A:aqs and A:codfw: Restart to pickup logging jars — T353550 - eevans@cumin1002
  • 22:33 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1005
  • 22:30 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 22:30 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: migrate cloudelastic1005 to private IPs - bking@cumin2002"
  • 22:29 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: migrate cloudelastic1005 to private IPs - bking@cumin2002"
  • 22:27 bking@cumin2002: START - Cookbook sre.dns.netbox
  • 22:25 wfan: payments-wiki upgraded from 29eb0fff to 709d89bf
  • 22:19 bking@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudelastic1005.wikimedia.org
  • 22:19 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 22:19 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudelastic1005.wikimedia.org decommissioned, removing all IPs except the asset tag one - bking@cumin2002"
  • 22:16 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudelastic1005.wikimedia.org decommissioned, removing all IPs except the asset tag one - bking@cumin2002"
  • 22:12 bking@cumin2002: START - Cookbook sre.dns.netbox
  • 22:08 vriley@cumin1002: START - Cookbook sre.hosts.provision for host an-redacteddb1001.mgmt.eqiad.wmnet with reboot policy FORCED
  • 22:05 bking@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudelastic1005.wikimedia.org
  • 22:05 brennen@deploy2002: Finished scap: Backport for Filter out null external link attributes (T357668) (duration: 11m 40s)
  • 22:03 bking@cumin2002: conftool action : set/weight=10:pooled=yes; selector: name=cloudelastic1006\.eqiad\.wmnet
  • 22:00 vriley@cumin1002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 21:59 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching P{P:cassandra%rack = "c_f"} and A:aqs and A:codfw: Restart to pickup logging jars — T353550 - eevans@cumin1002
  • 21:59 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching P{P:cassandra%rack = "b_e"} and A:aqs and A:codfw: Restart to pickup logging jars — T353550 - eevans@cumin1002
  • 21:59 ryankemper@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (3 nodes at a time) for ElasticSearch cluster search_eqiad: eqiad plugin upgrade - ryankemper@cumin2002 - T356651
  • 21:58 vriley@cumin1002: START - Cookbook sre.dns.netbox
  • 21:57 brennen@deploy2002: brennen: Continuing with sync
  • 21:56 ryankemper@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.UPGRADE (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw plugin upgrade - ryankemper@cumin2002 - T356651
  • 21:54 brennen@deploy2002: brennen: Backport for Filter out null external link attributes (T357668) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:53 brennen@deploy2002: Started scap: Backport for Filter out null external link attributes (T357668)
  • 21:52 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Banning hosts: cloudelastic1005* for IP migration - bking@cumin2002 - T355617
  • 21:52 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Banning hosts: cloudelastic1005* for IP migration - bking@cumin2002 - T355617
  • 21:51 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Unbanning all hosts in cloudelastic
  • 21:51 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Unbanning all hosts in cloudelastic
  • 21:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1006.eqiad.wmnet with OS bullseye
  • 21:28 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - bking@cumin2002"
  • 21:26 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching P{P:cassandra%rack = "b_e"} and A:aqs and A:codfw: Restart to pickup logging jars — T353550 - eevans@cumin1002
  • 21:21 brennen@deploy2002: rebuilt and synchronized wikiversions files: group2 wikis to 1.42.0-wmf.18 refs T354436
  • 21:20 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching P{P:cassandra%rack = "a_c"} and A:aqs and A:codfw: Restart to pickup logging jars — T353550 - eevans@cumin1002
  • 20:47 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching P{P:cassandra%rack = "a_c"} and A:aqs and A:codfw: Restart to pickup logging jars — T353550 - eevans@cumin1002
  • 20:46 brennen@deploy2002: Synchronized php: group1 wikis to 1.42.0-wmf.18 refs T354436 (duration: 08m 05s)
  • 20:41 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching P{P:cassandra%rack = "rack3"} and A:aqs and A:eqiad: Restart to pickup logging jars — T353550 - eevans@cumin1002
  • 20:38 brennen@deploy2002: rebuilt and synchronized wikiversions files: group1 wikis to 1.42.0-wmf.18 refs T354436
  • 20:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2109 (T352010)', diff saved to https://phabricator.wikimedia.org/P56870 and previous config saved to /var/cache/conftool/dbconfig/20240215-202036-ladsgroup.json
  • 20:20 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2109.codfw.wmnet with reason: Maintenance
  • 20:20 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2109.codfw.wmnet with reason: Maintenance
  • 20:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2105 (T352010)', diff saved to https://phabricator.wikimedia.org/P56869 and previous config saved to /var/cache/conftool/dbconfig/20240215-202014-ladsgroup.json
  • 20:08 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching P{P:cassandra%rack = "rack3"} and A:aqs and A:eqiad: Restart to pickup logging jars — T353550 - eevans@cumin1002
  • 20:06 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching P{P:cassandra%rack = "rack2"} and A:aqs and A:eqiad: Restart to pickup logging jars — T353550 - eevans@cumin1002
  • 20:05 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2105', diff saved to https://phabricator.wikimedia.org/P56868 and previous config saved to /var/cache/conftool/dbconfig/20240215-200507-ladsgroup.json
  • 20:00 arnaudb@cumin1002: dbctl commit (dc=all): 'es2024 (re)pooling @ 100%: T355866 - Post migration repool of es2024', diff saved to https://phabricator.wikimedia.org/P56867 and previous config saved to /var/cache/conftool/dbconfig/20240215-200015-arnaudb.json
  • 19:58 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - bking@cumin2002"
  • 19:50 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2105', diff saved to https://phabricator.wikimedia.org/P56866 and previous config saved to /var/cache/conftool/dbconfig/20240215-195001-ladsgroup.json
  • 19:48 ryankemper@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw plugin upgrade - ryankemper@cumin2002 - T356651
  • 19:45 arnaudb@cumin1002: dbctl commit (dc=all): 'es2024 (re)pooling @ 75%: T355866 - Post migration repool of es2024', diff saved to https://phabricator.wikimedia.org/P56865 and previous config saved to /var/cache/conftool/dbconfig/20240215-194510-arnaudb.json
  • 19:43 apergos: manually generating checksums in parallel for wikidata full history dumps run, in screen session, owned by ariel, on snapshot1009
  • 19:42 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1006.eqiad.wmnet with reason: host reimage
  • 19:39 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1006.eqiad.wmnet with reason: host reimage
  • 19:34 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2105 (T352010)', diff saved to https://phabricator.wikimedia.org/P56864 and previous config saved to /var/cache/conftool/dbconfig/20240215-193455-ladsgroup.json
  • 19:31 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching P{P:cassandra%rack = "rack2"} and A:aqs and A:eqiad: Restart to pickup logging jars — T353550 - eevans@cumin1002
  • 19:30 arnaudb@cumin1002: dbctl commit (dc=all): 'es2024 (re)pooling @ 50%: T355866 - Post migration repool of es2024', diff saved to https://phabricator.wikimedia.org/P56863 and previous config saved to /var/cache/conftool/dbconfig/20240215-193005-arnaudb.json
  • 19:24 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1006.eqiad.wmnet with OS bullseye
  • 19:22 brennen@deploy2002: rebuilt and synchronized wikiversions files: group2 wikis to 1.42.0-wmf.18 refs T354436
  • 19:15 arnaudb@cumin1002: dbctl commit (dc=all): 'es2024 (re)pooling @ 25%: T355866 - Post migration repool of es2024', diff saved to https://phabricator.wikimedia.org/P56862 and previous config saved to /var/cache/conftool/dbconfig/20240215-191500-arnaudb.json
  • 19:14 arnaudb@cumin1002: dbctl commit (dc=all): 'db2122 (re)pooling @ 100%: T355866 - Post migration repool of db2122', diff saved to https://phabricator.wikimedia.org/P56861 and previous config saved to /var/cache/conftool/dbconfig/20240215-191454-arnaudb.json
  • 19:12 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1247 (T352010)', diff saved to https://phabricator.wikimedia.org/P56860 and previous config saved to /var/cache/conftool/dbconfig/20240215-191226-ladsgroup.json
  • 19:12 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1247.eqiad.wmnet with reason: Maintenance
  • 19:12 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1247.eqiad.wmnet with reason: Maintenance
  • 19:12 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1246:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P56859 and previous config saved to /var/cache/conftool/dbconfig/20240215-191203-ladsgroup.json
  • 19:11 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudelastic1006.eqiad.wmnet with OS bullseye
  • 19:04 brennen: train 1.42.0-wmf.18 (T354436): no current blockers, rolling to all wikis.
  • 18:59 arnaudb@cumin1002: dbctl commit (dc=all): 'db2122 (re)pooling @ 75%: T355866 - Post migration repool of db2122', diff saved to https://phabricator.wikimedia.org/P56858 and previous config saved to /var/cache/conftool/dbconfig/20240215-185949-arnaudb.json
  • 18:56 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1246:3314', diff saved to https://phabricator.wikimedia.org/P56857 and previous config saved to /var/cache/conftool/dbconfig/20240215-185657-ladsgroup.json
  • 18:50 fnegri@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudservices1006.eqiad.wmnet
  • 18:44 arnaudb@cumin1002: dbctl commit (dc=all): 'db2122 (re)pooling @ 50%: T355866 - Post migration repool of db2122', diff saved to https://phabricator.wikimedia.org/P56856 and previous config saved to /var/cache/conftool/dbconfig/20240215-184444-arnaudb.json
  • 18:42 fnegri@cumin1002: START - Cookbook sre.hosts.reboot-single for host cloudservices1006.eqiad.wmnet
  • 18:41 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1246:3314', diff saved to https://phabricator.wikimedia.org/P56855 and previous config saved to /var/cache/conftool/dbconfig/20240215-184150-ladsgroup.json
  • 18:29 arnaudb@cumin1002: dbctl commit (dc=all): 'db2122 (re)pooling @ 25%: T355866 - Post migration repool of db2122', diff saved to https://phabricator.wikimedia.org/P56853 and previous config saved to /var/cache/conftool/dbconfig/20240215-182939-arnaudb.json
  • 18:29 arnaudb@cumin1002: dbctl commit (dc=all): 'db2105 (re)pooling @ 100%: T355866 - Post migration repool of db2105', diff saved to https://phabricator.wikimedia.org/P56852 and previous config saved to /var/cache/conftool/dbconfig/20240215-182934-arnaudb.json
  • 18:26 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1246:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P56850 and previous config saved to /var/cache/conftool/dbconfig/20240215-182644-ladsgroup.json
  • 18:23 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1006.eqiad.wmnet with OS bullseye
  • 18:23 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1006
  • 18:21 bd808@deploy2002: helmfile [eqiad] DONE helmfile.d/services/toolhub: apply
  • 18:21 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1006
  • 18:21 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:20 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: migrate cloudelastic1006 to private IPs - bking@cumin2002"
  • 18:20 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: migrate cloudelastic1006 to private IPs - bking@cumin2002"
  • 18:18 bd808@deploy2002: helmfile [eqiad] START helmfile.d/services/toolhub: apply
  • 18:18 bking@cumin2002: START - Cookbook sre.dns.netbox
  • 18:17 bd808@deploy2002: helmfile [codfw] DONE helmfile.d/services/toolhub: apply
  • 18:17 bd808@deploy2002: helmfile [codfw] START helmfile.d/services/toolhub: apply
  • 18:16 bd808@deploy2002: helmfile [staging] DONE helmfile.d/services/toolhub: apply
  • 18:15 bd808@deploy2002: helmfile [staging] START helmfile.d/services/toolhub: apply
  • 18:14 arnaudb@cumin1002: dbctl commit (dc=all): 'db2105 (re)pooling @ 75%: T355866 - Post migration repool of db2105', diff saved to https://phabricator.wikimedia.org/P56849 and previous config saved to /var/cache/conftool/dbconfig/20240215-181429-arnaudb.json
  • 18:12 bking@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudelastic1006.wikimedia.org
  • 18:12 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:12 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudelastic1006.wikimedia.org decommissioned, removing all IPs except the asset tag one - bking@cumin2002"
  • 18:11 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudelastic1006.wikimedia.org decommissioned, removing all IPs except the asset tag one - bking@cumin2002"
  • 18:09 bking@cumin2002: START - Cookbook sre.dns.netbox
  • 18:02 bking@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudelastic1006.wikimedia.org
  • 17:59 arnaudb@cumin1002: dbctl commit (dc=all): 'db2105 (re)pooling @ 50%: T355866 - Post migration repool of db2105', diff saved to https://phabricator.wikimedia.org/P56848 and previous config saved to /var/cache/conftool/dbconfig/20240215-175924-arnaudb.json
  • 17:54 fnegri@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudservices1005.eqiad.wmnet
  • 17:48 swfrench-wmf: reenabled puppet on mediawiki::webserver hosts after deploying for T357436
  • 17:47 fnegri@cumin1002: START - Cookbook sre.hosts.reboot-single for host cloudservices1005.eqiad.wmnet
  • 17:44 arnaudb@cumin1002: dbctl commit (dc=all): 'db2105 (re)pooling @ 25%: T355866 - Post migration repool of db2105', diff saved to https://phabricator.wikimedia.org/P56847 and previous config saved to /var/cache/conftool/dbconfig/20240215-174419-arnaudb.json
  • 17:44 arnaudb@cumin1002: dbctl commit (dc=all): 'db2156 (re)pooling @ 100%: T355866 - Post migration repool of db2156', diff saved to https://phabricator.wikimedia.org/P56846 and previous config saved to /var/cache/conftool/dbconfig/20240215-174414-arnaudb.json
  • 17:38 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-wikifunctions: apply
  • 17:38 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-wikifunctions: apply
  • 17:37 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-wikifunctions: apply
  • 17:37 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-wikifunctions: apply
  • 17:37 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-misc: apply
  • 17:37 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-misc: apply
  • 17:37 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-misc: apply
  • 17:36 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-misc: apply
  • 17:36 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
  • 17:35 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
  • 17:35 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
  • 17:34 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
  • 17:34 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
  • 17:33 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
  • 17:33 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
  • 17:32 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
  • 17:32 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
  • 17:31 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
  • 17:31 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
  • 17:30 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
  • 17:29 arnaudb@cumin1002: dbctl commit (dc=all): 'db2156 (re)pooling @ 75%: T355866 - Post migration repool of db2156', diff saved to https://phabricator.wikimedia.org/P56844 and previous config saved to /var/cache/conftool/dbconfig/20240215-172909-arnaudb.json
  • 17:28 fnegri@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet1005.eqiad.wmnet
  • 17:24 aokoth@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on vrts1002.eqiad.wmnet with reason: Migration Ongoing
  • 17:24 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching P{P:cassandra%rack = "rack1"} and A:aqs and A:eqiad: Restart to pickup logging jars — T353550 - eevans@cumin1002
  • 17:24 aokoth@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on vrts1002.eqiad.wmnet with reason: Migration Ongoing
  • 17:23 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
  • 17:23 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
  • 17:22 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 17:21 fnegri@cumin1002: START - Cookbook sre.hosts.reboot-single for host cloudnet1005.eqiad.wmnet
  • 17:21 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 17:14 arnaudb@cumin1002: dbctl commit (dc=all): 'db2156 (re)pooling @ 50%: T355866 - Post migration repool of db2156', diff saved to https://phabricator.wikimedia.org/P56843 and previous config saved to /var/cache/conftool/dbconfig/20240215-171403-arnaudb.json
  • 17:05 swfrench-wmf: disabling puppet shortly on mediawiki::webserver hosts to deploy T357436
  • 16:59 arnaudb@cumin1002: dbctl commit (dc=all): 'db2156 (re)pooling @ 25%: T355866 - Post migration repool of db2156', diff saved to https://phabricator.wikimedia.org/P56842 and previous config saved to /var/cache/conftool/dbconfig/20240215-165858-arnaudb.json
  • 16:58 arnaudb@cumin1002: dbctl commit (dc=all): 'db2155 (re)pooling @ 100%: T355866 - Post migration repool of db2155', diff saved to https://phabricator.wikimedia.org/P56841 and previous config saved to /var/cache/conftool/dbconfig/20240215-165853-arnaudb.json
  • 16:53 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 16:53 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 16:53 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 16:53 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 16:53 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 16:53 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 16:52 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 16:51 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching P{P:cassandra%rack = "rack1"} and A:aqs and A:eqiad: Restart to pickup logging jars — T353550 - eevans@cumin1002
  • 16:46 eevans@deploy2002: Finished deploy [cassandra/logstash-logback-encoder@162f72f] (sessionstore): Deploying to updated target list — T353550 (duration: 00m 15s)
  • 16:46 eevans@deploy2002: Started deploy [cassandra/logstash-logback-encoder@162f72f] (sessionstore): Deploying to updated target list — T353550
  • 16:46 eevans@deploy2002: Finished deploy [cassandra/logstash-logback-encoder@162f72f] (ml-cache): Deploying to updated target list — T353550 (duration: 00m 15s)
  • 16:46 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on mw2379.codfw.wmnet with reason: BGP issues - uncordoned, needs investigation
  • 16:45 eevans@deploy2002: Started deploy [cassandra/logstash-logback-encoder@162f72f] (ml-cache): Deploying to updated target list — T353550
  • 16:45 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on mw2379.codfw.wmnet with reason: BGP issues - uncordoned, needs investigation
  • 16:45 eevans@deploy2002: Finished deploy [cassandra/logstash-logback-encoder@162f72f] (cassandra-dev): Deploying to updated target list — T353550 (duration: 00m 15s)
  • 16:45 eevans@deploy2002: Started deploy [cassandra/logstash-logback-encoder@162f72f] (cassandra-dev): Deploying to updated target list — T353550
  • 16:43 arnaudb@cumin1002: dbctl commit (dc=all): 'db2155 (re)pooling @ 75%: T355866 - Post migration repool of db2155', diff saved to https://phabricator.wikimedia.org/P56840 and previous config saved to /var/cache/conftool/dbconfig/20240215-164348-arnaudb.json
  • 16:43 eevans@deploy2002: Finished deploy [cassandra/logstash-logback-encoder@162f72f] (aqs): Deploying to updated target list — T353550 (duration: 00m 37s)
  • 16:43 eevans@deploy2002: Started deploy [cassandra/logstash-logback-encoder@162f72f] (aqs): Deploying to updated target list — T353550
  • 16:40 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 16:40 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 16:40 hnowlan@cumin2002: conftool action : set/pooled=no; selector: name=mw2379.codfw.wmnet
  • 16:40 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 16:40 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 16:40 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 16:38 dancy@deploy2002: Finished scap: Backport for Load WikimediaCampaignEvents if CampaignEvents is loaded (T347909) (duration: 13m 36s)
  • 16:30 dancy@deploy2002: mhorsey and dancy: Continuing with sync
  • 16:29 hnowlan: kubectl cordon mw2379.codfw.wmnet - BGP issues
  • 16:28 arnaudb@cumin1002: dbctl commit (dc=all): 'db2155 (re)pooling @ 50%: T355866 - Post migration repool of db2155', diff saved to https://phabricator.wikimedia.org/P56839 and previous config saved to /var/cache/conftool/dbconfig/20240215-162843-arnaudb.json
  • 16:26 dancy@deploy2002: mhorsey and dancy: Backport for Load WikimediaCampaignEvents if CampaignEvents is loaded (T347909) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 16:24 dancy@deploy2002: Started scap: Backport for Load WikimediaCampaignEvents if CampaignEvents is loaded (T347909)
  • 16:16 hnowlan@cumin2002: conftool action : set/weight=10:pooled=yes; selector: name=(mw2311.codfw.wmnet|mw2335.codfw.wmnet|mw2379.codfw.wmnet|mw2380.codfw.wmnet|mw2383.codfw.wmnet),cluster=kubernetes,service=kubesvc
  • 16:14 Daimona: Creating new DB table for the WikimediaCampaignEvents extension in x1.testwiki, x1.test2wiki, x1.officewiki, and x1.wikishared # T347909
  • 16:13 cgoubert@cumin2002: conftool action : set/pooled=yes; selector: name=(mw2302|mw2303|mw2304|mw2305|mw2306|mw2307|mw2308|mw2309|mw2426).*
  • 16:13 arnaudb@cumin1002: dbctl commit (dc=all): 'db2155 (re)pooling @ 25%: T355866 - Post migration repool of db2155', diff saved to https://phabricator.wikimedia.org/P56838 and previous config saved to /var/cache/conftool/dbconfig/20240215-161338-arnaudb.json
  • 16:13 claime: Repooling mw2302|mw2303|mw2304|mw2305|mw2306|mw2307|mw2308|mw2309|mw2426 - T355866
  • 16:13 claime: Uncordoning kubernetes2059.codfw.wmnet kubernetes2028.codfw.wmnet kubernetes2027.codfw.wmnet kubernetes2060.codfw.wmnet kubernetes2008.codfw.wmnet kubernetes2007.codfw.wmnet kubernetes2055.codfw.wmnet mw2301.codfw.wmnet mw2424.codfw.wmnet mw2425.codfw.wmnet mw2427.codfw.wmnet - T355866
  • 16:13 hnowlan@deploy2002: helmfile [eqiad] [canary] DONE helmfile.d/services/mw-jobrunner : sync
  • 16:12 hnowlan@deploy2002: helmfile [eqiad] [main] DONE helmfile.d/services/mw-jobrunner : sync
  • 16:12 hnowlan@deploy2002: helmfile [eqiad] [canary] START helmfile.d/services/mw-jobrunner : sync
  • 16:12 hnowlan@deploy2002: helmfile [eqiad] [main] START helmfile.d/services/mw-jobrunner : sync
  • 16:00 topranks: commencing move of server uplinks codfw row A6 T355866
  • 15:57 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 38 hosts with reason: Migrating servers in codfw rack A6 to lsw1-a6-codfw
  • 15:56 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on 38 hosts with reason: Migrating servers in codfw rack A6 to lsw1-a6-codfw
  • 15:54 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on es2028.codfw.wmnet with reason: T355866 - Migrate servers in codfw rack A6 from asw-a6-codfw to lsw1-a6-codfw
  • 15:54 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on es2028.codfw.wmnet with reason: T355866 - Migrate servers in codfw rack A6 from asw-a6-codfw to lsw1-a6-codfw
  • 15:54 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on es2027.codfw.wmnet with reason: T355866 - Migrate servers in codfw rack A6 from asw-a6-codfw to lsw1-a6-codfw
  • 15:53 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on es2027.codfw.wmnet with reason: T355866 - Migrate servers in codfw rack A6 from asw-a6-codfw to lsw1-a6-codfw
  • 15:49 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on asw-a-codfw,cr[1-2]-codfw,lsw1-a6-codfw.mgmt with reason: prepping for server uplink migration codfw rack a6
  • 15:49 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on asw-a-codfw,cr[1-2]-codfw,lsw1-a6-codfw.mgmt with reason: prepping for server uplink migration codfw rack a6
  • 15:46 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on es2024.codfw.wmnet with reason: T355866 - Migrate servers in codfw rack A6 from asw-a6-codfw to lsw1-a6-codfw
  • 15:46 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on es2024.codfw.wmnet with reason: T355866 - Migrate servers in codfw rack A6 from asw-a6-codfw to lsw1-a6-codfw
  • 15:46 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2133.codfw.wmnet with reason: T355866 - Migrate servers in codfw rack A6 from asw-a6-codfw to lsw1-a6-codfw
  • 15:46 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on db2133.codfw.wmnet with reason: T355866 - Migrate servers in codfw rack A6 from asw-a6-codfw to lsw1-a6-codfw
  • 15:46 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2122.codfw.wmnet with reason: T355866 - Migrate servers in codfw rack A6 from asw-a6-codfw to lsw1-a6-codfw
  • 15:46 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on db2122.codfw.wmnet with reason: T355866 - Migrate servers in codfw rack A6 from asw-a6-codfw to lsw1-a6-codfw
  • 15:46 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2105.codfw.wmnet with reason: T355866 - Migrate servers in codfw rack A6 from asw-a6-codfw to lsw1-a6-codfw
  • 15:45 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on db2105.codfw.wmnet with reason: T355866 - Migrate servers in codfw rack A6 from asw-a6-codfw to lsw1-a6-codfw
  • 15:45 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2156.codfw.wmnet with reason: T355866 - Migrate servers in codfw rack A6 from asw-a6-codfw to lsw1-a6-codfw
  • 15:45 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on db2156.codfw.wmnet with reason: T355866 - Migrate servers in codfw rack A6 from asw-a6-codfw to lsw1-a6-codfw
  • 15:45 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2155.codfw.wmnet with reason: T355866 - Migrate servers in codfw rack A6 from asw-a6-codfw to lsw1-a6-codfw
  • 15:45 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on db2155.codfw.wmnet with reason: T355866 - Migrate servers in codfw rack A6 from asw-a6-codfw to lsw1-a6-codfw
  • 15:45 arnaudb@cumin1002: dbctl commit (dc=all): 'T355866 - db2155 db2156 db2105 db2122 db2133 es2024', diff saved to https://phabricator.wikimedia.org/P56837 and previous config saved to /var/cache/conftool/dbconfig/20240215-154520-arnaudb.json
  • 15:24 moritzm: imported openssl11 1.1.1w-0+deb11u1+wmf2 to component/haproxy26 T352744 (with fix for libssl11-dev file contents)
  • 15:15 cgoubert@cumin2002: conftool action : set/pooled=inactive; selector: name=(mw2302|mw2303|mw2304|mw2305|mw2306|mw2307|mw2308|mw2309|mw2426).*
  • 15:15 claime: Depooling mw2302|mw2303|mw2304|mw2305|mw2306|mw2307|mw2308|mw2309|mw2426 - T355866
  • 15:14 claime: Draining kubernetes2059.codfw.wmnet kubernetes2028.codfw.wmnet kubernetes2027.codfw.wmnet kubernetes2060.codfw.wmnet kubernetes2008.codfw.wmnet kubernetes2007.codfw.wmnet kubernetes2055.codfw.wmnet mw2301.codfw.wmnet mw2424.codfw.wmnet mw2425.codfw.wmnet mw2427.codfw.wmnet - T355866
  • 15:12 Lucas_WMDE: UTC afternoon backport+config window done
  • 14:47 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for [ruwikiquote] Add 'suppressredirect' right to editors (T357241) (duration: 09m 26s)
  • 14:40 logmsgbot: lucaswerkmeister-wmde@deploy2002 superpes and lucaswerkmeister-wmde: Continuing with sync
  • 14:40 logmsgbot: lucaswerkmeister-wmde@deploy2002 superpes and lucaswerkmeister-wmde: Backport for [ruwikiquote] Add 'suppressredirect' right to editors (T357241) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:38 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for [ruwikiquote] Add 'suppressredirect' right to editors (T357241)
  • 14:37 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 14:36 hnowlan: migrating cirrusSearchLinksUpdate to k8s
  • 14:36 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
  • 14:36 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 14:35 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
  • 14:35 hnowlan@deploy2002: helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 14:35 hnowlan@deploy2002: helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply
  • 14:29 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for [rowiki] Change autoconfirmed setting (T355990) (duration: 10m 55s)
  • 14:23 ayounsi@cumin1002: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
  • 14:21 logmsgbot: lucaswerkmeister-wmde@deploy2002 superpes and lucaswerkmeister-wmde: Continuing with sync
  • 14:20 logmsgbot: lucaswerkmeister-wmde@deploy2002 superpes and lucaswerkmeister-wmde: Backport for [rowiki] Change autoconfirmed setting (T355990) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:18 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for [rowiki] Change autoconfirmed setting (T355990)
  • 14:17 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host testvm2005.codfw.wmnet with OS bookworm
  • 14:16 ayounsi@cumin1002: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
  • 14:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1246:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P56836 and previous config saved to /var/cache/conftool/dbconfig/20240215-140613-ladsgroup.json
  • 14:06 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1246.eqiad.wmnet with reason: Maintenance
  • 14:05 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1246.eqiad.wmnet with reason: Maintenance
  • 14:05 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on testvm2005.codfw.wmnet with reason: host reimage
  • 14:02 ayounsi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on testvm2005.codfw.wmnet with reason: host reimage
  • 13:48 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host testvm2005.codfw.wmnet with OS bookworm
  • 13:01 aborrero@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1031.eqiad.wmnet with OS bookworm
  • 12:34 hnowlan@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host mw2379.codfw.wmnet
  • 12:32 aborrero@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1031.eqiad.wmnet with reason: host reimage
  • 12:30 aborrero@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1031.eqiad.wmnet with reason: host reimage
  • 12:21 hnowlan@cumin2002: START - Cookbook sre.hosts.reboot-single for host mw2379.codfw.wmnet
  • 12:17 moritzm: installing Linux 5.10.209 on Bullseye hosts
  • 12:11 aborrero@cumin1002: START - Cookbook sre.hosts.reimage for host cloudvirt1031.eqiad.wmnet with OS bookworm
  • 12:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-test-ui1001.eqiad.wmnet
  • 12:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host an-test-ui1001.eqiad.wmnet
  • 11:59 claime: Bumping external traffic to mw-on-k8s to 45% - T357507
  • 11:57 cgoubert@deploy2002: Finished scap: Deploying mw-on-k8s 1003499 1003393 - T349796 T357507 (duration: 00m 50s)
  • 11:56 cgoubert@deploy2002: Started scap: Deploying mw-on-k8s 1003499 1003393 - T349796 T357507
  • 11:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: analytics_test_cluster::hadoop::ui
  • 11:39 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host clouddumps1002.wikimedia.org
  • 11:36 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: analytics_test_cluster::hadoop::ui
  • 11:30 taavi@cumin1002: START - Cookbook sre.hosts.reboot-single for host clouddumps1002.wikimedia.org
  • 11:25 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2105 (T352010)', diff saved to https://phabricator.wikimedia.org/P56834 and previous config saved to /var/cache/conftool/dbconfig/20240215-112535-ladsgroup.json
  • 11:25 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2105.codfw.wmnet with reason: Maintenance
  • 11:25 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2105.codfw.wmnet with reason: Maintenance
  • 11:18 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mw2379.codfw.wmnet
  • 11:10 hnowlan@cumin2002: START - Cookbook sre.hosts.reboot-single for host mw2379.codfw.wmnet
  • 11:07 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for Revert "Include article name in Ploticus error messages" (T357268), Revert "Include article name in Ploticus error messages" (T357268) (duration: 10m 59s)
  • 11:07 zabe: zabe@mwmaint2002:~$ mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=mkwiki --logwiki=metawiki 'CatCat' 'MonkeyPython' # T357602
  • 11:00 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and matmarex: Continuing with sync
  • 10:58 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and matmarex: Backport for Revert "Include article name in Ploticus error messages" (T357268), Revert "Include article name in Ploticus error messages" (T357268) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 10:56 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for Revert "Include article name in Ploticus error messages" (T357268), Revert "Include article name in Ploticus error messages" (T357268)
  • 10:53 zabe: zabe@mwmaint2002:/tmp/uploads$ mwscript importImages.php --wiki=commonswiki --comment-ext=txt --user="OGPawlis" . # T357605
  • 10:37 hnowlan: running `homer 'cr*codfw*' commit 'T351074'` for new k8s nodes
  • 10:24 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1245.eqiad.wmnet with reason: Maintenance
  • 10:24 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1245.eqiad.wmnet with reason: Maintenance
  • 10:24 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1244:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P56831 and previous config saved to /var/cache/conftool/dbconfig/20240215-102409-ladsgroup.json
  • 10:09 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1244:3314', diff saved to https://phabricator.wikimedia.org/P56830 and previous config saved to /var/cache/conftool/dbconfig/20240215-100903-ladsgroup.json
  • 09:53 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1244:3314', diff saved to https://phabricator.wikimedia.org/P56829 and previous config saved to /var/cache/conftool/dbconfig/20240215-095356-ladsgroup.json
  • 09:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host eventlog1003.eqiad.wmnet
  • 09:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host eventlog1003.eqiad.wmnet
  • 09:43 akosiaris@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 09:42 akosiaris@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 09:42 akosiaris@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 09:41 akosiaris@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 09:41 akosiaris@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 09:40 akosiaris@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 09:38 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: eventlogging::analytics
  • 09:38 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1244:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P56827 and previous config saved to /var/cache/conftool/dbconfig/20240215-093850-ladsgroup.json
  • 09:29 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: eventlogging::analytics
  • 08:50 moritzm: rebalance Ganeti codfw/A now that the switch maintenance for A5 and A6 is completed (T355864, T355863)
  • 08:39 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host restbase1036.eqiad.wmnet
  • 08:35 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host restbase1036.eqiad.wmnet
  • 08:32 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: apifeatureusage::logstash
  • 08:18 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: apifeatureusage::logstash
  • 05:43 kart_: Update cxserver to 2023-12-04-083437-production (T344982, T338432, T351138)
  • 05:40 kartik@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
  • 05:39 kartik@deploy2002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
  • 05:39 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
  • 05:38 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/cxserver: apply
  • 04:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P56823 and previous config saved to /var/cache/conftool/dbconfig/20240215-044554-ladsgroup.json
  • 04:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1243 (T352010)', diff saved to https://phabricator.wikimedia.org/P56822 and previous config saved to /var/cache/conftool/dbconfig/20240215-043047-ladsgroup.json
  • 04:30 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 04:29 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 02:31 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host restbase1036.eqiad.wmnet with OS bullseye
  • 02:31 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 02:29 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 02:14 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on restbase1036.eqiad.wmnet with reason: host reimage
  • 02:11 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on restbase1036.eqiad.wmnet with reason: host reimage
  • 01:55 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host restbase1036.eqiad.wmnet with OS bullseye
  • 01:46 aokoth@cumin1002: END (FAIL) - Cookbook sre.ganeti.reboot-vm (exit_code=99) for VM vrts1002.eqiad.wmnet
  • 01:37 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching P{P:cassandra%rack = "d"} and A:restbase and A:codfw: Restart to pickup logging jars — T353550 - eevans@cumin1002
  • 00:45 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching P{P:cassandra%rack = "d"} and A:restbase and A:codfw: Restart to pickup logging jars — T353550 - eevans@cumin1002

2024-02-14

  • 23:57 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1243 (T352010)', diff saved to https://phabricator.wikimedia.org/P56821 and previous config saved to /var/cache/conftool/dbconfig/20240214-235725-ladsgroup.json
  • 23:57 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1243.eqiad.wmnet with reason: Maintenance
  • 23:57 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1243.eqiad.wmnet with reason: Maintenance
  • 23:57 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1242 (T352010)', diff saved to https://phabricator.wikimedia.org/P56820 and previous config saved to /var/cache/conftool/dbconfig/20240214-235703-ladsgroup.json
  • 23:41 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P56819 and previous config saved to /var/cache/conftool/dbconfig/20240214-234157-ladsgroup.json
  • 23:32 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching P{P:cassandra%rack = "c"} and A:restbase and A:codfw: Restart to pickup logging jars — T353550 - eevans@cumin1002
  • 23:26 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P56818 and previous config saved to /var/cache/conftool/dbconfig/20240214-232651-ladsgroup.json
  • 23:14 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: apply new master settings - bking@cumin2002 - T355617
  • 23:11 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1242 (T352010)', diff saved to https://phabricator.wikimedia.org/P56817 and previous config saved to /var/cache/conftool/dbconfig/20240214-231144-ladsgroup.json
  • 23:10 eileen: civicrm upgraded from 3ee91f59 to 84ba0ccf
  • 22:51 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: apply new master settings - bking@cumin2002 - T355617
  • 22:50 bking@cumin2002: conftool action : set/pooled=yes; selector: name=cloudelastic1008.eqiad.wmnet
  • 22:50 bking@cumin2002: conftool action : set/pooled=yes; selector: name=cloudelastic1007.eqiad.wmnet
  • 22:49 bking@cumin2002: conftool action : set/weight=10; selector: name=cloudelastic1008.eqiad.wmnet
  • 22:49 bking@cumin2002: conftool action : set/weight=10; selector: name=cloudelastic1007.eqiad.wmnet
  • 22:48 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: apply new master settings - bking@cumin2002 - T355617
  • 22:39 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching P{P:cassandra%rack = "c"} and A:restbase and A:codfw: Restart to pickup logging jars — T353550 - eevans@cumin1002
  • 22:33 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: apply new master settings - bking@cumin2002 - T355617
  • 22:20 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Banning hosts: cloudelastic1005*,cloudelastic1006* for IP migration - bking@cumin2002 - T355617
  • 22:20 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Banning hosts: cloudelastic1005*,cloudelastic1006* for IP migration - bking@cumin2002 - T355617
  • 22:19 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Unbanning all hosts in cloudelastic
  • 22:19 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Unbanning all hosts in cloudelastic
  • 22:13 urandom: restarting Cassandra: restbase/codfw, row b — T353550
  • 22:10 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1007.eqiad.wmnet with OS bullseye
  • 22:10 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - bking@cumin2002"
  • 22:08 cjming: end of UTC late backport window
  • 22:07 cjming@deploy2002: Finished scap: Backport for throttle.php: Add throttle rule for editathon (T356654) (duration: 08m 31s)
  • 22:00 cjming@deploy2002: zoranzoki21 and cjming: Continuing with sync
  • 22:00 cjming@deploy2002: zoranzoki21 and cjming: Backport for throttle.php: Add throttle rule for editathon (T356654) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:59 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1240.eqiad.wmnet with reason: Maintenance
  • 21:59 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1240.eqiad.wmnet with reason: Maintenance
  • 21:59 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1223 (T352010)', diff saved to https://phabricator.wikimedia.org/P56816 and previous config saved to /var/cache/conftool/dbconfig/20240214-215934-ladsgroup.json
  • 21:58 cjming@deploy2002: Started scap: Backport for throttle.php: Add throttle rule for editathon (T356654)
  • 21:56 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:56 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:53 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:53 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:52 cjming@deploy2002: Finished scap: Backport for Turn on Parsoid read views by default on wikitech Talk pages (T355374) (duration: 10m 44s)
  • 21:52 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching restbase1032.eqiad.wmnet: Restart to pickup logging jars — T353550 - eevans@cumin1002
  • 21:45 cjming@deploy2002: cscott and cjming: Continuing with sync
  • 21:44 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1223', diff saved to https://phabricator.wikimedia.org/P56815 and previous config saved to /var/cache/conftool/dbconfig/20240214-214427-ladsgroup.json
  • 21:43 cjming@deploy2002: cscott and cjming: Backport for Turn on Parsoid read views by default on wikitech Talk pages (T355374) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:41 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:41 cjming@deploy2002: Started scap: Backport for Turn on Parsoid read views by default on wikitech Talk pages (T355374)
  • 21:41 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:41 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:41 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:41 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching restbase1032.eqiad.wmnet: Restart to pickup logging jars — T353550 - eevans@cumin1002
  • 21:38 cjming@deploy2002: Finished scap: Backport for New communities will not share scripts going forward (T331679), Register dblist (duration: 10m 06s)
  • 21:36 eevans@cumin1002: END (FAIL) - Cookbook sre.cassandra.roll-restart (exit_code=99) for nodes matching P{P:cassandra%rack = "b"} and A:restbase and A:eqiad: Restart to pickup logging jars — T353550 - eevans@cumin1002
  • 21:35 arnaudb@cumin1002: dbctl commit (dc=all): 'db2176 (re)pooling @ 100%: T355864 - Post migration repool of db2176', diff saved to https://phabricator.wikimedia.org/P56814 and previous config saved to /var/cache/conftool/dbconfig/20240214-213544-arnaudb.json
  • 21:31 cjming@deploy2002: cjming and jdlrobson: Continuing with sync
  • 21:29 cjming@deploy2002: cjming and jdlrobson: Backport for New communities will not share scripts going forward (T331679), Register dblist synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:29 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1223', diff saved to https://phabricator.wikimedia.org/P56813 and previous config saved to /var/cache/conftool/dbconfig/20240214-212920-ladsgroup.json
  • 21:28 cjming@deploy2002: Started scap: Backport for New communities will not share scripts going forward (T331679), Register dblist
  • 21:26 cjming@deploy2002: Sync cancelled.
  • 21:20 arnaudb@cumin1002: dbctl commit (dc=all): 'db2176 (re)pooling @ 75%: T355864 - Post migration repool of db2176', diff saved to https://phabricator.wikimedia.org/P56812 and previous config saved to /var/cache/conftool/dbconfig/20240214-212038-arnaudb.json
  • 21:14 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1223 (T352010)', diff saved to https://phabricator.wikimedia.org/P56811 and previous config saved to /var/cache/conftool/dbconfig/20240214-211413-ladsgroup.json
  • 21:08 cjming@deploy2002: cjming and jdlrobson: Backport for New communities will not share scripts going forward (T331679) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:05 arnaudb@cumin1002: dbctl commit (dc=all): 'db2176 (re)pooling @ 50%: T355864 - Post migration repool of db2176', diff saved to https://phabricator.wikimedia.org/P56810 and previous config saved to /var/cache/conftool/dbconfig/20240214-210531-arnaudb.json
  • 21:05 cjming@deploy2002: Started scap: Backport for New communities will not share scripts going forward (T331679)
  • 20:59 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on restbase1036.eqiad.wmnet with reason: host reimage
  • 20:57 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching P{P:cassandra%rack = "b"} and A:restbase and A:eqiad: Restart to pickup logging jars — T353550 - eevans@cumin1002
  • 20:56 inflatador: bking@pcc-db1001.puppet-diffs.eqiad1.wikimedia.cloud updating puppet facts for PCC
  • 20:56 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on restbase1036.eqiad.wmnet with reason: host reimage
  • 20:56 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - bking@cumin2002"
  • 20:52 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching P{P:cassandra%rack = "a"} and A:restbase and A:eqiad: Restart to pickup logging jars — T353550 - eevans@cumin1002
  • 20:51 inflatador: bking@puppetmaster1001 manually updating facts data for PCC T355617
  • 20:50 arnaudb@cumin1002: dbctl commit (dc=all): 'db2176 (re)pooling @ 25%: T355864 - Post migration repool of db2176', diff saved to https://phabricator.wikimedia.org/P56808 and previous config saved to /var/cache/conftool/dbconfig/20240214-205027-arnaudb.json
  • 20:50 arnaudb@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 100%: T355864 - Post migration repool of db2175', diff saved to https://phabricator.wikimedia.org/P56807 and previous config saved to /var/cache/conftool/dbconfig/20240214-205021-arnaudb.json
  • 20:41 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host restbase1036.eqiad.wmnet with OS bullseye
  • 20:39 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1007.eqiad.wmnet with reason: host reimage
  • 20:37 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:36 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:36 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1007.eqiad.wmnet with reason: host reimage
  • 20:35 arnaudb@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 75%: T355864 - Post migration repool of db2175', diff saved to https://phabricator.wikimedia.org/P56806 and previous config saved to /var/cache/conftool/dbconfig/20240214-203517-arnaudb.json
  • 20:34 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host restbase1036.mgmt.eqiad.wmnet with reboot policy FORCED
  • 20:31 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:31 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:22 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1007.eqiad.wmnet with OS bullseye
  • 20:21 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1007
  • 20:20 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1007
  • 20:20 arnaudb@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 50%: T355864 - Post migration repool of db2175', diff saved to https://phabricator.wikimedia.org/P56805 and previous config saved to /var/cache/conftool/dbconfig/20240214-202012-arnaudb.json
  • 20:17 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 20:16 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: migrate cloudelastic1007 to private IPs - bking@cumin2002"
  • 20:16 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: migrate cloudelastic1007 to private IPs - bking@cumin2002"
  • 20:12 bking@cumin2002: START - Cookbook sre.dns.netbox
  • 20:06 arnaudb@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 25%: T355864 - Post migration repool of db2175', diff saved to https://phabricator.wikimedia.org/P56804 and previous config saved to /var/cache/conftool/dbconfig/20240214-200507-arnaudb.json
  • 20:05 arnaudb@cumin1002: dbctl commit (dc=all): 'db2154 (re)pooling @ 100%: T355864 - Post migration repool of db2154', diff saved to https://phabricator.wikimedia.org/P56803 and previous config saved to /var/cache/conftool/dbconfig/20240214-200501-arnaudb.json
  • 20:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudelastic1007.wikimedia.org
  • 20:05 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 20:04 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudelastic1007.wikimedia.org decommissioned, removing all IPs except the asset tag one - bking@cumin2002"
  • 20:03 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudelastic1007.wikimedia.org decommissioned, removing all IPs except the asset tag one - bking@cumin2002"
  • 19:59 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching P{P:cassandra%rack = "a"} and A:restbase and A:eqiad: Restart to pickup logging jars — T353550 - eevans@cumin1002
  • 19:58 bking@cumin2002: START - Cookbook sre.dns.netbox
  • 19:54 eevans@deploy2002: Finished deploy [cassandra/logstash-logback-encoder@0521449]: Deploying to updated target list — T353550 (duration: 00m 05s)
  • 19:54 eevans@deploy2002: Started deploy [cassandra/logstash-logback-encoder@0521449]: Deploying to updated target list — T353550
  • 19:53 eevans@deploy2002: Finished deploy [cassandra/logstash-logback-encoder@0521449]: Deploying to updated target list — T353550 (duration: 00m 06s)
  • 19:53 eevans@deploy2002: Started deploy [cassandra/logstash-logback-encoder@0521449]: Deploying to updated target list — T353550
  • 19:53 eevans@deploy2002: Finished deploy [cassandra/logstash-logback-encoder@0521449]: Deploying to updated target list — T353550 (duration: 00m 07s)
  • 19:53 eevans@deploy2002: Started deploy [cassandra/logstash-logback-encoder@0521449]: Deploying to updated target list — T353550
  • 19:51 ejegg: payments-wiki upgraded from b699e513 to 29eb0fff
  • 19:51 eevans@deploy2002: Finished deploy [cassandra/logstash-logback-encoder@0521449]: Deploying to updated target list — T353550 (duration: 00m 03s)
  • 19:51 eevans@deploy2002: Started deploy [cassandra/logstash-logback-encoder@0521449]: Deploying to updated target list — T353550
  • 19:50 eevans@deploy2002: Finished deploy [cassandra/logstash-logback-encoder@0521449]: Deploying to updated target list — T353550 (duration: 00m 04s)
  • 19:50 eevans@deploy2002: Started deploy [cassandra/logstash-logback-encoder@0521449]: Deploying to updated target list — T353550
  • 19:50 bking@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudelastic1007.wikimedia.org
  • 19:50 eevans@deploy2002: Finished deploy [cassandra/logstash-logback-encoder@0521449]: Deploying to updated target list — T353550 (duration: 00m 05s)
  • 19:50 eevans@deploy2002: Started deploy [cassandra/logstash-logback-encoder@0521449]: Deploying to updated target list — T353550
  • 19:49 arnaudb@cumin1002: dbctl commit (dc=all): 'db2154 (re)pooling @ 75%: T355864 - Post migration repool of db2154', diff saved to https://phabricator.wikimedia.org/P56802 and previous config saved to /var/cache/conftool/dbconfig/20240214-194956-arnaudb.json
  • 19:46 eevans@deploy2002: Finished deploy [cassandra/logstash-logback-encoder@0521449]: Deploying to updated target list — T353550 (duration: 00m 34s)
  • 19:46 eevans@deploy2002: Started deploy [cassandra/logstash-logback-encoder@0521449]: Deploying to updated target list — T353550
  • 19:43 eevans@deploy2002: Finished deploy [cassandra/logstash-logback-encoder@0521449]: Deploying to updated target list — T353550 (duration: 00m 14s)
  • 19:43 eevans@deploy2002: Started deploy [cassandra/logstash-logback-encoder@0521449]: Deploying to updated target list — T353550
  • 19:42 eevans@deploy2002: Finished deploy [cassandra/logstash-logback-encoder@0521449]: Deploying to updated target list — T353550 (duration: 00m 45s)
  • 19:41 eevans@deploy2002: Started deploy [cassandra/logstash-logback-encoder@0521449]: Deploying to updated target list — T353550
  • 19:39 eevans@deploy2002: Finished deploy [cassandra/logstash-logback-encoder@0521449]: Deploying to updated target list — T353550 (duration: 01m 17s)
  • 19:38 eevans@deploy2002: Started deploy [cassandra/logstash-logback-encoder@0521449]: Deploying to updated target list — T353550
  • 19:37 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2383.codfw.wmnet with OS bullseye
  • 19:36 eevans@deploy2002: Finished deploy [cassandra/logstash-logback-encoder@0521449]: Deploying to updated target list — T353550 (duration: 00m 16s)
  • 19:36 eevans@deploy2002: Started deploy [cassandra/logstash-logback-encoder@0521449]: Deploying to updated target list — T353550
  • 19:35 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2380.codfw.wmnet with OS bullseye
  • 19:35 jclark@cumin1002: START - Cookbook sre.hosts.provision for host restbase1036.mgmt.eqiad.wmnet with reboot policy FORCED
  • 19:34 arnaudb@cumin1002: dbctl commit (dc=all): 'db2154 (re)pooling @ 50%: T355864 - Post migration repool of db2154', diff saved to https://phabricator.wikimedia.org/P56801 and previous config saved to /var/cache/conftool/dbconfig/20240214-193451-arnaudb.json
  • 19:34 brennen@deploy2002: Synchronized php: group1 wikis to 1.42.0-wmf.18 refs T354436 (duration: 07m 35s)
  • 19:31 eevans@deploy2002: Finished deploy [cassandra/logstash-logback-encoder@5c2dd00]: Deploying to updated target list — T353550 (duration: 00m 20s)
  • 19:30 eevans@deploy2002: Started deploy [cassandra/logstash-logback-encoder@5c2dd00]: Deploying to updated target list — T353550
  • 19:28 eevans@deploy2002: Finished deploy [cassandra/logstash-logback-encoder@5c2dd00]: Deploying to updated target list — T353550 (duration: 00m 41s)
  • 19:27 eevans@deploy2002: Started deploy [cassandra/logstash-logback-encoder@5c2dd00]: Deploying to updated target list — T353550
  • 19:26 brennen@deploy2002: rebuilt and synchronized wikiversions files: group1 wikis to 1.42.0-wmf.18 refs T354436
  • 19:19 arnaudb@cumin1002: dbctl commit (dc=all): 'db2154 (re)pooling @ 25%: T355864 - Post migration repool of db2154', diff saved to https://phabricator.wikimedia.org/P56800 and previous config saved to /var/cache/conftool/dbconfig/20240214-191946-arnaudb.json
  • 19:19 arnaudb@cumin1002: dbctl commit (dc=all): 'db2153 (re)pooling @ 100%: T355864 - Post migration repool of db2153', diff saved to https://phabricator.wikimedia.org/P56799 and previous config saved to /var/cache/conftool/dbconfig/20240214-191941-arnaudb.json
  • 19:14 brennen: train 1.42.0-wmf.18 (T354436): logs chill, no current blockers, rolling to group1.
  • 19:13 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2383.codfw.wmnet with reason: host reimage
  • 19:11 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2380.codfw.wmnet with reason: host reimage
  • 19:09 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2383.codfw.wmnet with reason: host reimage
  • 19:08 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2380.codfw.wmnet with reason: host reimage
  • 19:04 arnaudb@cumin1002: dbctl commit (dc=all): 'db2153 (re)pooling @ 75%: T355864 - Post migration repool of db2153', diff saved to https://phabricator.wikimedia.org/P56798 and previous config saved to /var/cache/conftool/dbconfig/20240214-190436-arnaudb.json
  • 18:58 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2379.codfw.wmnet with OS bullseye
  • 18:54 cmooney@cumin1002: START - Cookbook sre.hosts.reimage for host mw2383.codfw.wmnet with OS bullseye
  • 18:53 cmooney@cumin1002: START - Cookbook sre.hosts.reimage for host mw2380.codfw.wmnet with OS bullseye
  • 18:52 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1242 (T352010)', diff saved to https://phabricator.wikimedia.org/P56797 and previous config saved to /var/cache/conftool/dbconfig/20240214-185218-ladsgroup.json
  • 18:52 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) mw2383.codfw.wmnet on all recursors
  • 18:52 cmooney@cumin1002: START - Cookbook sre.dns.wipe-cache mw2383.codfw.wmnet on all recursors
  • 18:52 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1242.eqiad.wmnet with reason: Maintenance
  • 18:52 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1242.eqiad.wmnet with reason: Maintenance
  • 18:52 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1241 (T352010)', diff saved to https://phabricator.wikimedia.org/P56796 and previous config saved to /var/cache/conftool/dbconfig/20240214-185207-ladsgroup.json
  • 18:52 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) mw2380.codfw.wmnet on all recursors
  • 18:51 cmooney@cumin1002: START - Cookbook sre.dns.wipe-cache mw2380.codfw.wmnet on all recursors
  • 18:49 arnaudb@cumin1002: dbctl commit (dc=all): 'db2153 (re)pooling @ 50%: T355864 - Post migration repool of db2153', diff saved to https://phabricator.wikimedia.org/P56795 and previous config saved to /var/cache/conftool/dbconfig/20240214-184931-arnaudb.json
  • 18:48 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:48 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for codfw mw servers - cmooney@cumin1002"
  • 18:47 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for codfw mw servers - cmooney@cumin1002"
  • 18:39 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 18:37 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: apply new master settings - bking@cumin2002 - T355617
  • 18:37 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P56794 and previous config saved to /var/cache/conftool/dbconfig/20240214-183700-ladsgroup.json
  • 18:34 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2379.codfw.wmnet with reason: host reimage
  • 18:34 arnaudb@cumin1002: dbctl commit (dc=all): 'db2153 (re)pooling @ 25%: T355864 - Post migration repool of db2153', diff saved to https://phabricator.wikimedia.org/P56793 and previous config saved to /var/cache/conftool/dbconfig/20240214-183426-arnaudb.json
  • 18:34 arnaudb@cumin1002: dbctl commit (dc=all): 'db2104 (re)pooling @ 100%: T355864 - Post migration repool of db2104', diff saved to https://phabricator.wikimedia.org/P56792 and previous config saved to /var/cache/conftool/dbconfig/20240214-183421-arnaudb.json
  • 18:31 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2379.codfw.wmnet with reason: host reimage
  • 18:24 fnegri@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudvirtlocal1003.eqiad.wmnet
  • 18:21 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P56791 and previous config saved to /var/cache/conftool/dbconfig/20240214-182154-ladsgroup.json
  • 18:19 arnaudb@cumin1002: dbctl commit (dc=all): 'db2104 (re)pooling @ 75%: T355864 - Post migration repool of db2104', diff saved to https://phabricator.wikimedia.org/P56790 and previous config saved to /var/cache/conftool/dbconfig/20240214-181916-arnaudb.json
  • 18:18 fnegri@cumin1002: START - Cookbook sre.hosts.reboot-single for host cloudvirtlocal1003.eqiad.wmnet
  • 18:14 hnowlan@cumin2002: conftool action : set/weight=10:pooled=yes; selector: name=mw2282.codfw.wmnet
  • 18:12 cmooney@cumin1002: START - Cookbook sre.hosts.reimage for host mw2379.codfw.wmnet with OS bullseye
  • 18:11 hnowlan: running `homer 'cr*codfw*' commit 'T351074'` to pick up mw2282's bgp change
  • 18:09 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) mw2379.codfw.wmnet on all recursors
  • 18:09 cmooney@cumin1002: START - Cookbook sre.dns.wipe-cache mw2379.codfw.wmnet on all recursors
  • 18:07 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1241 (T352010)', diff saved to https://phabricator.wikimedia.org/P56789 and previous config saved to /var/cache/conftool/dbconfig/20240214-180647-ladsgroup.json
  • 18:06 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:06 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add new entries for mw2379 - cmooney@cumin1002"
  • 18:05 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add new entries for mw2379 - cmooney@cumin1002"
  • 18:05 fnegri@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudvirtlocal1002.eqiad.wmnet
  • 18:04 arnaudb@cumin1002: dbctl commit (dc=all): 'db2104 (re)pooling @ 50%: T355864 - Post migration repool of db2104', diff saved to https://phabricator.wikimedia.org/P56788 and previous config saved to /var/cache/conftool/dbconfig/20240214-180411-arnaudb.json
  • 18:03 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: apply new master settings - bking@cumin2002 - T355617
  • 18:02 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 18:01 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: apply new master settings - bking@cumin2002 - T355617
  • 17:59 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mw2282.codfw.wmnet with reason: Testing if reimage is stable T355333
  • 17:59 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on mw2282.codfw.wmnet with reason: Testing if reimage is stable T355333
  • 17:58 fnegri@cumin1002: START - Cookbook sre.hosts.reboot-single for host cloudvirtlocal1002.eqiad.wmnet
  • 17:56 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2282.codfw.wmnet with OS bullseye
  • 17:49 arnaudb@cumin1002: dbctl commit (dc=all): 'db2104 (re)pooling @ 25%: T355864 - Post migration repool of db2104', diff saved to https://phabricator.wikimedia.org/P56787 and previous config saved to /var/cache/conftool/dbconfig/20240214-174906-arnaudb.json
  • 17:49 arnaudb@cumin1002: dbctl commit (dc=all): 'db2145 (re)pooling @ 100%: T355864 - Post migration repool of db2145', diff saved to https://phabricator.wikimedia.org/P56786 and previous config saved to /var/cache/conftool/dbconfig/20240214-174900-arnaudb.json
  • 17:48 ladsgroup@deploy2002: Finished scap: Backport for Enable echo conditional defaults for loginwiki since 2013 (T357072) (duration: 12m 08s)
  • 17:44 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: apply new master settings - bking@cumin2002 - T355617
  • 17:41 ladsgroup@deploy2002: ladsgroup: Continuing with sync
  • 17:39 fnegri@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudvirtlocal1001.eqiad.wmnet
  • 17:39 ladsgroup@deploy2002: ladsgroup: Backport for Enable echo conditional defaults for loginwiki since 2013 (T357072) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 17:36 ladsgroup@deploy2002: Started scap: Backport for Enable echo conditional defaults for loginwiki since 2013 (T357072)
  • 17:33 arnaudb@cumin1002: dbctl commit (dc=all): 'db2145 (re)pooling @ 75%: T355864 - Post migration repool of db2145', diff saved to https://phabricator.wikimedia.org/P56785 and previous config saved to /var/cache/conftool/dbconfig/20240214-173355-arnaudb.json
  • 17:32 fnegri@cumin1002: START - Cookbook sre.hosts.reboot-single for host cloudvirtlocal1001.eqiad.wmnet
  • 17:32 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2282.codfw.wmnet with reason: host reimage
  • 17:32 fnegri@cumin1002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host cloudvirtlocal1001.eqiad.wmnet
  • 17:29 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2282.codfw.wmnet with reason: host reimage
  • 17:18 arnaudb@cumin1002: dbctl commit (dc=all): 'db2145 (re)pooling @ 50%: T355864 - Post migration repool of db2145', diff saved to https://phabricator.wikimedia.org/P56784 and previous config saved to /var/cache/conftool/dbconfig/20240214-171850-arnaudb.json
  • 17:13 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2282.codfw.wmnet with OS bullseye
  • 17:10 fabfur: enabled puppet on A:cp-upload to apply https://gerrit.wikimedia.org/r/c/operations/puppet/+/1003109 selectively (T357479)
  • 17:03 arnaudb@cumin1002: dbctl commit (dc=all): 'db2145 (re)pooling @ 25%: T355864 - Post migration repool of db2145', diff saved to https://phabricator.wikimedia.org/P56783 and previous config saved to /var/cache/conftool/dbconfig/20240214-170345-arnaudb.json
  • 17:03 arnaudb@cumin1002: dbctl commit (dc=all): 'db2121 (re)pooling @ 100%: T355864 - Post migration repool of db2121', diff saved to https://phabricator.wikimedia.org/P56782 and previous config saved to /var/cache/conftool/dbconfig/20240214-170339-arnaudb.json
  • 16:56 fabfur: disabled puppet on A:cp-upload to apply https://gerrit.wikimedia.org/r/c/operations/puppet/+/1003109 selectively (T357479)
  • 16:52 hnowlan@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host mw2282.codfw.wmnet with OS bullseye
  • 16:48 arnaudb@cumin1002: dbctl commit (dc=all): 'db2121 (re)pooling @ 75%: T355864 - Post migration repool of db2121', diff saved to https://phabricator.wikimedia.org/P56781 and previous config saved to /var/cache/conftool/dbconfig/20240214-164834-arnaudb.json
  • 16:48 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2282.codfw.wmnet with OS bullseye
  • 16:37 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest2005.codfw.wmnet with OS bookworm
  • 16:33 arnaudb@cumin1002: dbctl commit (dc=all): 'db2121 (re)pooling @ 50%: T355864 - Post migration repool of db2121', diff saved to https://phabricator.wikimedia.org/P56780 and previous config saved to /var/cache/conftool/dbconfig/20240214-163330-arnaudb.json
  • 16:20 hnowlan@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host mw2282.codfw.wmnet with OS bullseye
  • 16:19 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps2005.codfw.wmnet
  • 16:18 arnaudb@cumin1002: dbctl commit (dc=all): 'db2121 (re)pooling @ 25%: T355864 - Post migration repool of db2121', diff saved to https://phabricator.wikimedia.org/P56779 and previous config saved to /var/cache/conftool/dbconfig/20240214-161824-arnaudb.json
  • 16:17 cgoubert@cumin2002: conftool action : set/pooled=yes; selector: name=(mw2402|mw2403|mw2404|mw2405|mw2407|mw2408|mw2409|mw2401|mw2410|mw2411|parse2001|parse2002|parse2003).*
  • 16:16 claime: Repooling mw2402|mw2403|mw2404|mw2405|mw2407|mw2408|mw2409|mw2401|mw2410|mw2411|parse2001|parse2002|parse2003 for T355864
  • 16:16 claime: Uncordoning kubernetes2019.codfw.wmnet kubernetes2018.codfw.wmnet mw2420.codfw.wmnet mw2421.codfw.wmnet mw2406.codfw.wmnet mw2422.codfw.wmnet mw2423.codfw.wmnet for T355864
  • 16:07 topranks: Moving server uplinks from old switch to new codfw rack A5 T355864
  • 16:07 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on sretest2005.codfw.wmnet with reason: host reimage
  • 16:07 ayounsi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2005.codfw.wmnet with reason: host reimage
  • 16:07 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 38 hosts with reason: Migrating servers in codfw rack A5 to lsw1-a5-codfw
  • 16:06 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on 38 hosts with reason: Migrating servers in codfw rack A5 to lsw1-a5-codfw
  • 16:04 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: apply new master settings - bking@cumin2002 - T355617
  • 15:59 topranks: disable puppet fleet-wide to allow for disruption to puppetmaster/puppetserver during network maint T355864
  • 15:59 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: apply new master settings - bking@cumin2002 - T355617
  • 15:55 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: apply new master settings - bking@cumin2002 - T355617
  • 15:54 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2282.codfw.wmnet with OS bullseye
  • 15:53 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host sretest2005.codfw.wmnet with OS bookworm
  • 15:53 hnowlan@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mw2282.mgmt.codfw.wmnet with reboot policy GRACEFUL
  • 15:53 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: apply new master settings - bking@cumin2002 - T355617
  • 15:51 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on asw-a-codfw,cr[1-2]-codfw,lsw1-a5-codfw.mgmt with reason: prepping for server uplink migration codfw rack a5
  • 15:50 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on asw-a-codfw,cr[1-2]-codfw,lsw1-a5-codfw.mgmt with reason: prepping for server uplink migration codfw rack a5
  • 15:47 arnaudb@cumin1002: dbctl commit (dc=all): 'T355864 - Depool db2121 db2132 db2145 db2104 db2153 db2154 db2175 db2176', diff saved to https://phabricator.wikimedia.org/P56778 and previous config saved to /var/cache/conftool/dbconfig/20240214-154753-arnaudb.json
  • 15:47 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2176.codfw.wmnet with reason: T355864 - Migrate servers in codfw rack A5 from asw-a5-codfw to lsw1-a5-codfw
  • 15:47 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on db2176.codfw.wmnet with reason: T355864 - Migrate servers in codfw rack A5 from asw-a5-codfw to lsw1-a5-codfw
  • 15:47 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2175.codfw.wmnet with reason: T355864 - Migrate servers in codfw rack A5 from asw-a5-codfw to lsw1-a5-codfw
  • 15:46 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on db2175.codfw.wmnet with reason: T355864 - Migrate servers in codfw rack A5 from asw-a5-codfw to lsw1-a5-codfw
  • 15:46 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2154.codfw.wmnet with reason: T355864 - Migrate servers in codfw rack A5 from asw-a5-codfw to lsw1-a5-codfw
  • 15:46 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on db2154.codfw.wmnet with reason: T355864 - Migrate servers in codfw rack A5 from asw-a5-codfw to lsw1-a5-codfw
  • 15:46 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2153.codfw.wmnet with reason: T355864 - Migrate servers in codfw rack A5 from asw-a5-codfw to lsw1-a5-codfw
  • 15:46 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on db2153.codfw.wmnet with reason: T355864 - Migrate servers in codfw rack A5 from asw-a5-codfw to lsw1-a5-codfw
  • 15:46 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2104.codfw.wmnet with reason: T355864 - Migrate servers in codfw rack A5 from asw-a5-codfw to lsw1-a5-codfw
  • 15:46 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on db2104.codfw.wmnet with reason: T355864 - Migrate servers in codfw rack A5 from asw-a5-codfw to lsw1-a5-codfw
  • 15:46 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2145.codfw.wmnet with reason: T355864 - Migrate servers in codfw rack A5 from asw-a5-codfw to lsw1-a5-codfw
  • 15:45 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on db2145.codfw.wmnet with reason: T355864 - Migrate servers in codfw rack A5 from asw-a5-codfw to lsw1-a5-codfw
  • 15:45 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2132.codfw.wmnet with reason: T355864 - Migrate servers in codfw rack A5 from asw-a5-codfw to lsw1-a5-codfw
  • 15:45 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on db2132.codfw.wmnet with reason: T355864 - Migrate servers in codfw rack A5 from asw-a5-codfw to lsw1-a5-codfw
  • 15:45 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2121.codfw.wmnet with reason: T355864 - Migrate servers in codfw rack A5 from asw-a5-codfw to lsw1-a5-codfw
  • 15:45 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on db2121.codfw.wmnet with reason: T355864 - Migrate servers in codfw rack A5 from asw-a5-codfw to lsw1-a5-codfw
  • 15:44 hnowlan@cumin2002: START - Cookbook sre.hosts.provision for host mw2282.mgmt.codfw.wmnet with reboot policy GRACEFUL
  • 15:44 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps2005.codfw.wmnet
  • 15:37 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host restbase1042.eqiad.wmnet
  • 15:30 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host restbase1042.eqiad.wmnet
  • 15:22 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host restbase1041.eqiad.wmnet
  • 15:21 hnowlan@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw2380.codfw.wmnet with OS bullseye
  • 15:15 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host restbase1041.eqiad.wmnet
  • 15:14 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2380.codfw.wmnet with OS bullseye
  • 15:14 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host restbase1040.eqiad.wmnet
  • 15:11 hnowlan@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw2379.codfw.wmnet with OS bullseye
  • 14:56 cmooney@cumin1002: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
  • 14:52 hnowlan@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw2380.codfw.wmnet with OS bullseye
  • 14:52 hnowlan@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw2379.codfw.wmnet with OS bullseye
  • 14:52 hnowlan@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw2383.codfw.wmnet with OS bullseye
  • 14:51 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host restbase1038.eqiad.wmnet
  • 14:50 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host restbase1037.eqiad.wmnet
  • 14:47 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2311.codfw.wmnet with reason: host reimage
  • 14:45 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest2005.codfw.wmnet with OS bookworm
  • 14:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1223 (T352010)', diff saved to https://phabricator.wikimedia.org/P56777 and previous config saved to /var/cache/conftool/dbconfig/20240214-144537-ladsgroup.json
  • 14:45 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1223.eqiad.wmnet with reason: Maintenance
  • 14:45 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1223.eqiad.wmnet with reason: Maintenance
  • 14:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T352010)', diff saved to https://phabricator.wikimedia.org/P56776 and previous config saved to /var/cache/conftool/dbconfig/20240214-144514-ladsgroup.json
  • 14:45 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2383.codfw.wmnet with OS bullseye
  • 14:44 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2379.codfw.wmnet with OS bullseye
  • 14:44 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2335.codfw.wmnet with reason: host reimage
  • 14:44 claime: Restarted rsyslog on A:wikikube-master
  • 14:44 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2380.codfw.wmnet with OS bullseye
  • 14:43 hnowlan@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw2380.codfw.wmnet with OS bullseye
  • 14:43 hnowlan@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw2379.codfw.wmnet with OS bullseye
  • 14:42 hnowlan@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw2383.codfw.wmnet with OS bullseye
  • 14:42 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2311.codfw.wmnet with reason: host reimage
  • 14:41 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2335.codfw.wmnet with reason: host reimage
  • 14:40 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host restbase1037.eqiad.wmnet
  • 14:38 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host restbase1035.eqiad.wmnet
  • 14:35 cgoubert@cumin2002: conftool action : set/pooled=inactive; selector: name=(mw2402|mw2403|mw2404|mw2405|mw2407|mw2408|mw2409|mw2401|mw2410|mw2411|parse2001|parse2002|parse2003).*
  • 14:34 claime: Depooling mw2402|mw2403|mw2404|mw2405|mw2407|mw2408|mw2409|mw2401|mw2410|mw2411|parse2001|parse2002|parse2003 for T355864
  • 14:33 TheresNoTime: close UTC afternoon backport window
  • 14:32 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:31 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host restbase1035.eqiad.wmnet
  • 14:31 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2005.codfw.wmnet with reason: host reimage
  • 14:31 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 14:31 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host restbase1034.eqiad.wmnet
  • 14:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P56774 and previous config saved to /var/cache/conftool/dbconfig/20240214-143006-ladsgroup.json
  • 14:29 samtar@deploy2002: Finished scap: Backport for prod: Stop setting $wgCampaignEventsEnableParticipantQuestions (T347608) (duration: 23m 37s)
  • 14:27 ayounsi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2005.codfw.wmnet with reason: host reimage
  • 14:26 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2335.codfw.wmnet with OS bullseye
  • 14:26 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2383.codfw.wmnet with OS bullseye
  • 14:26 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2380.codfw.wmnet with OS bullseye
  • 14:26 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2379.codfw.wmnet with OS bullseye
  • 14:25 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2311.codfw.wmnet with OS bullseye
  • 14:22 samtar@deploy2002: samtar and daimona: Continuing with sync
  • 14:15 claime: Draining and cordoning kubernetes2019.codfw.wmnet kubernetes2018.codfw.wmnet mw2420.codfw.wmnet mw2421.codfw.wmnet mw2406.codfw.wmnet mw2422.codfw.wmnet mw2423.codfw.wmnet for T355864
  • 14:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P56773 and previous config saved to /var/cache/conftool/dbconfig/20240214-141459-ladsgroup.json
  • 14:14 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host sretest2005.codfw.wmnet with OS bookworm
  • 14:10 samtar@deploy2002: samtar and daimona: Backport for prod: Stop setting $wgCampaignEventsEnableParticipantQuestions (T347608) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:09 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host restbase1034.eqiad.wmnet
  • 14:06 samtar@deploy2002: Started scap: Backport for prod: Stop setting $wgCampaignEventsEnableParticipantQuestions (T347608)
  • 14:05 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 14:03 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 13:59 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T352010)', diff saved to https://phabricator.wikimedia.org/P56772 and previous config saved to /var/cache/conftool/dbconfig/20240214-135953-ladsgroup.json
  • 13:59 brouberol@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host eventlog1003.eqiad.wmnet with OS bullseye
  • 13:58 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1241 (T352010)', diff saved to https://phabricator.wikimedia.org/P56771 and previous config saved to /var/cache/conftool/dbconfig/20240214-135813-ladsgroup.json
  • 13:58 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1241.eqiad.wmnet with reason: Maintenance
  • 13:57 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1241.eqiad.wmnet with reason: Maintenance
  • 13:57 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1221 (T352010)', diff saved to https://phabricator.wikimedia.org/P56770 and previous config saved to /var/cache/conftool/dbconfig/20240214-135750-ladsgroup.json
  • 13:52 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 13:50 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1212 (T352010)', diff saved to https://phabricator.wikimedia.org/P56769 and previous config saved to /var/cache/conftool/dbconfig/20240214-134959-ladsgroup.json
  • 13:49 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 13:49 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 13:49 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1212.eqiad.wmnet with reason: Maintenance
  • 13:49 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1212.eqiad.wmnet with reason: Maintenance
  • 13:49 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T352010)', diff saved to https://phabricator.wikimedia.org/P56768 and previous config saved to /var/cache/conftool/dbconfig/20240214-134929-ladsgroup.json
  • 13:42 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P56767 and previous config saved to /var/cache/conftool/dbconfig/20240214-134244-ladsgroup.json
  • 13:42 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 13:42 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host apifeatureusage2001.codfw.wmnet
  • 13:39 brouberol@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on eventlog1003.eqiad.wmnet with reason: host reimage
  • 13:36 brouberol@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on eventlog1003.eqiad.wmnet with reason: host reimage
  • 13:34 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P56766 and previous config saved to /var/cache/conftool/dbconfig/20240214-133422-ladsgroup.json
  • 13:27 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P56765 and previous config saved to /var/cache/conftool/dbconfig/20240214-132737-ladsgroup.json
  • 13:26 Daimona: T357007 Profiling current master version of CampaignEvents:GenerateInvitationList with excimer in mwmaint2002
  • 13:24 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host apifeatureusage2001.codfw.wmnet
  • 13:24 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest2005.codfw.wmnet with OS bookworm
  • 13:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host puppetserver2003.codfw.wmnet with OS bookworm
  • 13:24 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jmm@cumin2002"
  • 13:24 brouberol@cumin1002: START - Cookbook sre.hosts.reimage for host eventlog1003.eqiad.wmnet with OS bullseye
  • 13:19 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P56764 and previous config saved to /var/cache/conftool/dbconfig/20240214-131916-ladsgroup.json
  • 13:12 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1221 (T352010)', diff saved to https://phabricator.wikimedia.org/P56763 and previous config saved to /var/cache/conftool/dbconfig/20240214-131231-ladsgroup.json
  • 13:10 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2005.codfw.wmnet with reason: host reimage
  • 13:07 ayounsi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2005.codfw.wmnet with reason: host reimage
  • 13:05 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 13:04 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T352010)', diff saved to https://phabricator.wikimedia.org/P56762 and previous config saved to /var/cache/conftool/dbconfig/20240214-130410-ladsgroup.json
  • 13:02 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1198 (T352010)', diff saved to https://phabricator.wikimedia.org/P56761 and previous config saved to /var/cache/conftool/dbconfig/20240214-130157-ladsgroup.json
  • 13:02 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1198.eqiad.wmnet with reason: Maintenance
  • 13:01 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1198.eqiad.wmnet with reason: Maintenance
  • 13:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T352010)', diff saved to https://phabricator.wikimedia.org/P56760 and previous config saved to /var/cache/conftool/dbconfig/20240214-130134-ladsgroup.json
  • 12:52 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host sretest2005.codfw.wmnet with OS bookworm
  • 12:49 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be1045
  • 12:46 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P56758 and previous config saved to /var/cache/conftool/dbconfig/20240214-124627-ladsgroup.json
  • 12:31 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P56757 and previous config saved to /var/cache/conftool/dbconfig/20240214-123120-ladsgroup.json
  • 12:16 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T352010)', diff saved to https://phabricator.wikimedia.org/P56756 and previous config saved to /var/cache/conftool/dbconfig/20240214-121614-ladsgroup.json
  • 12:14 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1189 (T352010)', diff saved to https://phabricator.wikimedia.org/P56755 and previous config saved to /var/cache/conftool/dbconfig/20240214-121401-ladsgroup.json
  • 12:13 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1189.eqiad.wmnet with reason: Maintenance
  • 12:13 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1189.eqiad.wmnet with reason: Maintenance
  • 12:13 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T352010)', diff saved to https://phabricator.wikimedia.org/P56754 and previous config saved to /var/cache/conftool/dbconfig/20240214-121337-ladsgroup.json
  • 12:11 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jmm@cumin2002"
  • 12:02 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [codfw] DONE helmfile.d/services/termbox: apply
  • 12:02 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [codfw] START helmfile.d/services/termbox: apply
  • 12:00 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [eqiad] DONE helmfile.d/services/termbox: apply
  • 11:59 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [eqiad] START helmfile.d/services/termbox: apply
  • 11:58 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P56753 and previous config saved to /var/cache/conftool/dbconfig/20240214-115831-ladsgroup.json
  • 11:58 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [staging] DONE helmfile.d/services/termbox: apply
  • 11:58 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [staging] START helmfile.d/services/termbox: apply
  • 11:57 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be1045
  • 11:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on puppetserver2003.codfw.wmnet with reason: host reimage
  • 11:51 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on puppetserver2003.codfw.wmnet with reason: host reimage
  • 11:50 mvernon@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ms-be1045.eqiad.wmnet
  • 11:43 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P56752 and previous config saved to /var/cache/conftool/dbconfig/20240214-114325-ladsgroup.json
  • 11:40 mvernon@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ms-be1045.eqiad.wmnet
  • 11:33 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host puppetserver2003.codfw.wmnet with OS bookworm
  • 11:28 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T352010)', diff saved to https://phabricator.wikimedia.org/P56751 and previous config saved to /var/cache/conftool/dbconfig/20240214-112818-ladsgroup.json
  • 11:26 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1175 (T352010)', diff saved to https://phabricator.wikimedia.org/P56750 and previous config saved to /var/cache/conftool/dbconfig/20240214-112606-ladsgroup.json
  • 11:25 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1175.eqiad.wmnet with reason: Maintenance
  • 11:25 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1175.eqiad.wmnet with reason: Maintenance
  • 11:25 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T352010)', diff saved to https://phabricator.wikimedia.org/P56749 and previous config saved to /var/cache/conftool/dbconfig/20240214-112543-ladsgroup.json
  • 11:15 akosiaris@deploy2002: helmfile [staging] DONE helmfile.d/services/recommendation-api: apply
  • 11:14 akosiaris@deploy2002: helmfile [staging] START helmfile.d/services/recommendation-api: apply
  • 11:14 akosiaris@deploy2002: helmfile [codfw] DONE helmfile.d/services/recommendation-api: apply
  • 11:14 akosiaris@deploy2002: helmfile [codfw] START helmfile.d/services/recommendation-api: apply
  • 11:14 akosiaris@deploy2002: helmfile [eqiad] DONE helmfile.d/services/recommendation-api: apply
  • 11:14 akosiaris@deploy2002: helmfile [eqiad] START helmfile.d/services/recommendation-api: apply
  • 11:10 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P56748 and previous config saved to /var/cache/conftool/dbconfig/20240214-111037-ladsgroup.json
  • 11:06 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host puppetserver2003.codfw.wmnet with OS bookworm
  • 10:58 akosiaris@deploy2002: helmfile [staging] DONE helmfile.d/services/eventstreams-internal: apply
  • 10:58 aokoth@cumin1002: START - Cookbook sre.ganeti.reboot-vm for VM vrts1002.eqiad.wmnet
  • 10:58 akosiaris@deploy2002: helmfile [staging] START helmfile.d/services/eventstreams-internal: apply
  • 10:58 akosiaris@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventstreams-internal: apply
  • 10:57 akosiaris@deploy2002: helmfile [codfw] START helmfile.d/services/eventstreams-internal: apply
  • 10:57 akosiaris@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventstreams-internal: apply
  • 10:56 akosiaris@deploy2002: helmfile [eqiad] START helmfile.d/services/eventstreams-internal: apply
  • 10:55 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P56747 and previous config saved to /var/cache/conftool/dbconfig/20240214-105530-ladsgroup.json
  • 10:52 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2005.codfw.wmnet with OS bookworm
  • 10:48 akosiaris@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
  • 10:48 akosiaris@deploy2002: helmfile [codfw] START helmfile.d/services/eventstreams: apply
  • 10:48 jelto: import prometheus-etherpad-exporter 0.7 to bookworm-wikimedia on apt hosts - T316421
  • 10:46 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps2005.codfw.wmnet
  • 10:45 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps2005.codfw.wmnet
  • 10:41 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host puppetserver2003.codfw.wmnet with OS bookworm
  • 10:40 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T352010)', diff saved to https://phabricator.wikimedia.org/P56746 and previous config saved to /var/cache/conftool/dbconfig/20240214-104024-ladsgroup.json
  • 10:38 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1166 (T352010)', diff saved to https://phabricator.wikimedia.org/P56745 and previous config saved to /var/cache/conftool/dbconfig/20240214-103810-ladsgroup.json
  • 10:38 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 10:37 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 10:37 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 10:37 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 10:37 slyngs: Deploying new PKI checks to alertmanager
  • 10:33 akosiaris@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
  • 10:33 akosiaris@deploy2002: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
  • 10:31 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host puppetserver2003.codfw.wmnet with OS bookworm
  • 10:28 akosiaris@deploy2002: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
  • 10:28 akosiaris@deploy2002: helmfile [staging] START helmfile.d/services/eventstreams: apply
  • 10:19 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host puppetserver2003.codfw.wmnet with OS bookworm
  • 10:18 godog: powercycle titan1001
  • 10:02 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host sretest2005.codfw.wmnet with OS bookworm
  • 09:55 moritzm: installing Linux 5.10.209 on Bullseye hosts
  • 09:49 moritzm: imported openssl11 1.1.1w-0+deb11u1+wmf1 to component/haproxy26 T352744
  • 09:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2024.codfw.wmnet
  • 09:38 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2024.codfw.wmnet
  • 09:08 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2023.codfw.wmnet
  • 09:05 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2023.codfw.wmnet
  • 08:41 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1221 (T352010)', diff saved to https://phabricator.wikimedia.org/P56744 and previous config saved to /var/cache/conftool/dbconfig/20240214-084146-ladsgroup.json
  • 08:41 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 08:41 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 08:41 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1221.eqiad.wmnet with reason: Maintenance
  • 08:41 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1221.eqiad.wmnet with reason: Maintenance
  • 08:41 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T352010)', diff saved to https://phabricator.wikimedia.org/P56743 and previous config saved to /var/cache/conftool/dbconfig/20240214-084104-ladsgroup.json
  • 08:25 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P56742 and previous config saved to /var/cache/conftool/dbconfig/20240214-082558-ladsgroup.json
  • 08:20 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host clouddumps1001.wikimedia.org
  • 08:12 taavi: restart apache2 on lists1001 to remove traces of old, soon-to-expire TLS certificate
  • 08:11 taavi@cumin1002: START - Cookbook sre.hosts.reboot-single for host clouddumps1001.wikimedia.org
  • 08:10 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P56741 and previous config saved to /var/cache/conftool/dbconfig/20240214-081051-ladsgroup.json
  • 07:55 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T352010)', diff saved to https://phabricator.wikimedia.org/P56740 and previous config saved to /var/cache/conftool/dbconfig/20240214-075545-ladsgroup.json
  • 07:51 stran@deploy2002: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
  • 07:50 stran@deploy2002: helmfile [codfw] START helmfile.d/services/ipoid: apply
  • 07:50 stran@deploy2002: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
  • 07:49 stran@deploy2002: helmfile [eqiad] START helmfile.d/services/ipoid: apply
  • 07:48 stran@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
  • 07:48 stran@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
  • 07:48 stran@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
  • 07:47 stran@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
  • 06:22 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host restbase1035.eqiad.wmnet with OS bullseye
  • 06:22 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 03:11 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1199 (T352010)', diff saved to https://phabricator.wikimedia.org/P56739 and previous config saved to /var/cache/conftool/dbconfig/20240214-031125-ladsgroup.json
  • 03:11 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1199.eqiad.wmnet with reason: Maintenance
  • 03:11 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1199.eqiad.wmnet with reason: Maintenance
  • 03:11 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T352010)', diff saved to https://phabricator.wikimedia.org/P56738 and previous config saved to /var/cache/conftool/dbconfig/20240214-031103-ladsgroup.json
  • 02:55 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P56737 and previous config saved to /var/cache/conftool/dbconfig/20240214-025557-ladsgroup.json
  • 02:40 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P56736 and previous config saved to /var/cache/conftool/dbconfig/20240214-024050-ladsgroup.json
  • 02:25 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T352010)', diff saved to https://phabricator.wikimedia.org/P56735 and previous config saved to /var/cache/conftool/dbconfig/20240214-022544-ladsgroup.json
  • 01:44 eileen: civicrm upgraded from 497e0899 to 3ee91f59
  • 00:04 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ncmonitor1001.eqiad.wmnet with reason: host reimage
  • 00:01 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on ncmonitor1001.eqiad.wmnet with reason: host reimage

2024-02-13

  • 23:55 dzahn@cumin1002: START - Cookbook sre.hosts.reimage for host ncmonitor1001.eqiad.wmnet with OS bookworm
  • 21:23 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1190 (T352010)', diff saved to https://phabricator.wikimedia.org/P56734 and previous config saved to /var/cache/conftool/dbconfig/20240213-212343-ladsgroup.json
  • 21:23 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1190.eqiad.wmnet with reason: Maintenance
  • 21:23 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1190.eqiad.wmnet with reason: Maintenance
  • 21:23 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1160 (T352010)', diff saved to https://phabricator.wikimedia.org/P56733 and previous config saved to /var/cache/conftool/dbconfig/20240213-212321-ladsgroup.json
  • 21:08 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P56732 and previous config saved to /var/cache/conftool/dbconfig/20240213-210814-ladsgroup.json
  • 21:08 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190 (T352010)', diff saved to https://phabricator.wikimedia.org/P56731 and previous config saved to /var/cache/conftool/dbconfig/20240213-210813-ladsgroup.json
  • 20:59 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Unbanning all hosts in search_codfw
  • 20:59 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Unbanning all hosts in search_codfw
  • 20:53 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P56730 and previous config saved to /var/cache/conftool/dbconfig/20240213-205308-ladsgroup.json
  • 20:53 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P56729 and previous config saved to /var/cache/conftool/dbconfig/20240213-205307-ladsgroup.json
  • 20:38 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P56728 and previous config saved to /var/cache/conftool/dbconfig/20240213-203800-ladsgroup.json
  • 20:23 mutante: phab1004 - running public_task_dump.py T355502
  • 20:22 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190 (T352010)', diff saved to https://phabricator.wikimedia.org/P56727 and previous config saved to /var/cache/conftool/dbconfig/20240213-202254-ladsgroup.json
  • 20:22 brennen@deploy2002: Finished deploy [phabricator/deployment@f4a7f50]: deploy to phab1004 for T357464 (duration: 00m 48s)
  • 20:21 brennen@deploy2002: Started deploy [phabricator/deployment@f4a7f50]: deploy to phab1004 for T357464
  • 20:20 brennen@deploy2002: Finished deploy [phabricator/deployment@f4a7f50]: test deploy to phab2002 for T357464 (duration: 00m 29s)
  • 20:20 brennen@deploy2002: Started deploy [phabricator/deployment@f4a7f50]: test deploy to phab2002 for T357464
  • 20:08 eileen: civicrm upgraded from ac69725f to 497e0899
  • 19:57 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2190 (T352010)', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20240213-195724-ladsgroup.json
  • 19:57 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2190.codfw.wmnet with reason: Maintenance
  • 19:57 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2190.codfw.wmnet with reason: Maintenance
  • 19:57 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T352010)', diff saved to https://phabricator.wikimedia.org/P56725 and previous config saved to /var/cache/conftool/dbconfig/20240213-195701-ladsgroup.json
  • 19:41 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P56724 and previous config saved to /var/cache/conftool/dbconfig/20240213-194155-ladsgroup.json
  • 19:26 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P56723 and previous config saved to /var/cache/conftool/dbconfig/20240213-192648-ladsgroup.json
  • 19:11 brennen@deploy2002: rebuilt and synchronized wikiversions files: group0 wikis to 1.42.0-wmf.18 refs T354436
  • 19:11 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T352010)', diff saved to https://phabricator.wikimedia.org/P56722 and previous config saved to /var/cache/conftool/dbconfig/20240213-191142-ladsgroup.json
  • 19:01 brennen: train 1.42.0-wmf.18 (T354436): no current blockers, rolling to group0.
  • 18:43 brouberol@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-test-ui1001.eqiad.wmnet with OS bullseye
  • 18:42 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2177 (T352010)', diff saved to https://phabricator.wikimedia.org/P56721 and previous config saved to /var/cache/conftool/dbconfig/20240213-184159-ladsgroup.json
  • 18:41 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2177.codfw.wmnet with reason: Maintenance
  • 18:41 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2177.codfw.wmnet with reason: Maintenance
  • 18:41 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T352010)', diff saved to https://phabricator.wikimedia.org/P56720 and previous config saved to /var/cache/conftool/dbconfig/20240213-184137-ladsgroup.json
  • 18:26 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P56718 and previous config saved to /var/cache/conftool/dbconfig/20240213-182630-ladsgroup.json
  • 18:11 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P56717 and previous config saved to /var/cache/conftool/dbconfig/20240213-181124-ladsgroup.json
  • 18:01 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 18:00 ladsgroup@deploy2002: Finished scap: Backport for ruwiki: Add 'edituserjson' right to 'engineers' group (T355499) (duration: 08m 28s)
  • 17:56 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T352010)', diff saved to https://phabricator.wikimedia.org/P56716 and previous config saved to /var/cache/conftool/dbconfig/20240213-175617-ladsgroup.json
  • 17:53 ladsgroup@deploy2002: ammarpad and ladsgroup: Continuing with sync
  • 17:53 ladsgroup@deploy2002: ammarpad and ladsgroup: Backport for ruwiki: Add 'edituserjson' right to 'engineers' group (T355499) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 17:52 ladsgroup@deploy2002: Started scap: Backport for ruwiki: Add 'edituserjson' right to 'engineers' group (T355499)
  • 17:49 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for Use EditEntity for ItemMergeInteractor (T356149 T356764), Use EditEntity for MergeLexemesInteractor (T356149 T356764), Use EditEntity for ItemMergeInteractor (T356149 T356764), Use EditEntity for MergeLexemesInteractor (T356149 T356764) (duration: 10m 11s)
  • 17:46 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on restbase1035.eqiad.wmnet with reason: host reimage
  • 17:43 brouberol@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-test-ui1001.eqiad.wmnet with reason: host reimage
  • 17:42 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde: Continuing with sync
  • 17:41 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on restbase1035.eqiad.wmnet with reason: host reimage
  • 17:41 brouberol@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on an-test-ui1001.eqiad.wmnet with reason: host reimage
  • 17:40 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde: Backport for Use EditEntity for ItemMergeInteractor (T356149 T356764), Use EditEntity for MergeLexemesInteractor (T356149 T356764), Use EditEntity for ItemMergeInteractor (T356149 T356764), Use EditEntity for MergeLexemesInteractor (T356149 T356764) synced to the testservers (https://wik
  • 17:39 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for Use EditEntity for ItemMergeInteractor (T356149 T356764), Use EditEntity for MergeLexemesInteractor (T356149 T356764), Use EditEntity for ItemMergeInteractor (T356149 T356764), Use EditEntity for MergeLexemesInteractor (T356149 T356764)
  • 17:37 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cp[2027-2028].codfw.wmnet
  • 17:37 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for cp[2027-2028].codfw.wmnet
  • 17:36 brett@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=codfw,name=cp202(7|8).codfw.wmnet
  • 17:29 brouberol@cumin1002: START - Cookbook sre.hosts.reimage for host an-test-ui1001.eqiad.wmnet with OS bullseye
  • 17:26 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host restbase1035.eqiad.wmnet with OS bullseye
  • 17:26 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2156 (T352010)', diff saved to https://phabricator.wikimedia.org/P56715 and previous config saved to /var/cache/conftool/dbconfig/20240213-172620-ladsgroup.json
  • 17:26 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 17:26 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host restbase1035.mgmt.eqiad.wmnet with reboot policy FORCED
  • 17:26 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 17:26 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2156.codfw.wmnet with reason: Maintenance
  • 17:25 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2156.codfw.wmnet with reason: Maintenance
  • 17:25 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T352010)', diff saved to https://phabricator.wikimedia.org/P56714 and previous config saved to /var/cache/conftool/dbconfig/20240213-172542-ladsgroup.json
  • 17:25 brennen@deploy2002: Finished scap: testwikis wikis to 1.42.0-wmf.18 refs T354436 (duration: 24m 39s)
  • 17:23 sukhe: running authdns-update to lower dyna TTLs: T140365
  • 17:19 jclark@cumin1002: START - Cookbook sre.hosts.provision for host restbase1035.mgmt.eqiad.wmnet with reboot policy FORCED
  • 17:10 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P56713 and previous config saved to /var/cache/conftool/dbconfig/20240213-171034-ladsgroup.json
  • 17:04 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2005.codfw.wmnet with OS bookworm
  • 17:00 brennen@deploy2002: Started scap: testwikis wikis to 1.42.0-wmf.18 refs T354436
  • 16:55 volans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:05:00 on sretest1001.eqiad.wmnet with reason: training
  • 16:55 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P56712 and previous config saved to /var/cache/conftool/dbconfig/20240213-165527-ladsgroup.json
  • 16:55 volans@cumin1002: START - Cookbook sre.hosts.downtime for 0:05:00 on sretest1001.eqiad.wmnet with reason: training
  • 16:49 sukhe@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ncmonitor1001.eqiad.wmnet with OS bookworm
  • 16:42 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db1146.eqiad.wmnet
  • 16:42 arnaudb@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:42 arnaudb@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1146.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - arnaudb@cumin1002"
  • 16:41 arnaudb@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1146.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - arnaudb@cumin1002"
  • 16:40 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T352010)', diff saved to https://phabricator.wikimedia.org/P56709 and previous config saved to /var/cache/conftool/dbconfig/20240213-164021-ladsgroup.json
  • 16:39 arnaudb@cumin1002: START - Cookbook sre.dns.netbox
  • 16:36 sukhe: running authdns-update for CR 1003017: T346394
  • 16:34 arnaudb@cumin1002: START - Cookbook sre.hosts.decommission for hosts db1146.eqiad.wmnet
  • 16:33 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db1145.eqiad.wmnet
  • 16:33 arnaudb@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:33 arnaudb@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1145.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - arnaudb@cumin1002"
  • 16:32 arnaudb@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1145.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - arnaudb@cumin1002"
  • 16:30 arnaudb@cumin1002: START - Cookbook sre.dns.netbox
  • 16:24 arnaudb@cumin1002: START - Cookbook sre.hosts.decommission for hosts db1145.eqiad.wmnet
  • 16:24 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db1144.eqiad.wmnet
  • 16:24 arnaudb@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:24 arnaudb@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1144.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - arnaudb@cumin1002"
  • 16:23 arnaudb@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1144.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - arnaudb@cumin1002"
  • 16:21 arnaudb@cumin1002: START - Cookbook sre.dns.netbox
  • 16:14 arnaudb@cumin1002: START - Cookbook sre.hosts.decommission for hosts db1144.eqiad.wmnet
  • 16:13 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db1140.eqiad.wmnet
  • 16:13 arnaudb@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:13 arnaudb@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1140.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - arnaudb@cumin1002"
  • 16:12 arnaudb@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1140.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - arnaudb@cumin1002"
  • 16:11 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host sretest2005.codfw.wmnet with OS bookworm
  • 16:10 arnaudb@cumin1002: START - Cookbook sre.dns.netbox
  • 16:08 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2149 (T352010)', diff saved to https://phabricator.wikimedia.org/P56707 and previous config saved to /var/cache/conftool/dbconfig/20240213-160826-ladsgroup.json
  • 16:08 topranks: moving codfw rack a4 server links T355863
  • 16:08 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2149.codfw.wmnet with reason: Maintenance
  • 16:08 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2149.codfw.wmnet with reason: Maintenance
  • 16:05 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 23 hosts with reason: Migrating servers in codfw rack A4 to lsw1-a4-codfw
  • 16:05 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on 23 hosts with reason: Migrating servers in codfw rack A4 to lsw1-a4-codfw
  • 16:05 arnaudb@cumin1002: START - Cookbook sre.hosts.decommission for hosts db1140.eqiad.wmnet
  • 16:05 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host ncmonitor1001.eqiad.wmnet with OS bookworm
  • 16:04 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 23 hosts
  • 16:04 cmooney@cumin1002: START - Cookbook sre.hosts.remove-downtime for 23 hosts
  • 16:04 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for asw-a-codfw,cr[1-2]-codfw,lsw1-a4-codfw.mgmt
  • 16:04 cmooney@cumin1002: START - Cookbook sre.hosts.remove-downtime for asw-a-codfw,cr[1-2]-codfw,lsw1-a4-codfw.mgmt
  • 16:03 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db1139.eqiad.wmnet
  • 16:03 arnaudb@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:03 arnaudb@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1139.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - arnaudb@cumin1002"
  • 16:02 arnaudb@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1139.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - arnaudb@cumin1002"
  • 15:59 arnaudb@cumin1002: START - Cookbook sre.dns.netbox
  • 15:57 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2005.codfw.wmnet with OS bookworm
  • 15:56 akosiaris@deploy2002: helmfile [codfw] DONE helmfile.d/services/termbox: apply
  • 15:55 akosiaris@deploy2002: helmfile [codfw] START helmfile.d/services/termbox: apply
  • 15:54 arnaudb@cumin1002: START - Cookbook sre.hosts.decommission for hosts db1139.eqiad.wmnet
  • 15:53 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db1135.eqiad.wmnet
  • 15:53 arnaudb@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:53 arnaudb@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1135.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - arnaudb@cumin1002"
  • 15:53 akosiaris@deploy2002: helmfile [eqiad] DONE helmfile.d/services/termbox: apply
  • 15:52 arnaudb@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1135.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - arnaudb@cumin1002"
  • 15:52 akosiaris@deploy2002: helmfile [eqiad] START helmfile.d/services/termbox: apply
  • 15:50 akosiaris@deploy2002: helmfile [staging] DONE helmfile.d/services/termbox: apply
  • 15:50 akosiaris@deploy2002: helmfile [staging] START helmfile.d/services/termbox: apply
  • 15:50 arnaudb@cumin1002: START - Cookbook sre.dns.netbox
  • 15:44 topranks: moving netbox links and pre-configuring lsw1-a4-codfw for servers before network move T355863
  • 15:44 arnaudb@cumin1002: START - Cookbook sre.hosts.decommission for hosts db1135.eqiad.wmnet
  • 15:43 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db1133.eqiad.wmnet
  • 15:43 arnaudb@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:43 arnaudb@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1133.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - arnaudb@cumin1002"
  • 15:42 arnaudb@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1133.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - arnaudb@cumin1002"
  • 15:41 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 15:41 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 15:41 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2109 (T352010)', diff saved to https://phabricator.wikimedia.org/P56704 and previous config saved to /var/cache/conftool/dbconfig/20240213-154100-ladsgroup.json
  • 15:40 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ncmonitor1001.eqiad.wmnet with OS bookworm
  • 15:39 arnaudb@cumin1002: START - Cookbook sre.dns.netbox
  • 15:37 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1160 (T352010)', diff saved to https://phabricator.wikimedia.org/P56703 and previous config saved to /var/cache/conftool/dbconfig/20240213-153720-ladsgroup.json
  • 15:37 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1160.eqiad.wmnet with reason: Maintenance
  • 15:37 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1160.eqiad.wmnet with reason: Maintenance
  • 15:34 arnaudb@cumin1002: START - Cookbook sre.hosts.decommission for hosts db1133.eqiad.wmnet
  • 15:32 cgoubert@deploy2002: Finished scap: mw-on-k8s: Raise the number of canary replicas - T357402 (duration: 02m 58s)
  • 15:30 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on elastic[2061-2062,2089].codfw.wmnet with reason: T355863
  • 15:30 bking@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on elastic[2061-2062,2089].codfw.wmnet with reason: T355863
  • 15:29 cgoubert@deploy2002: Started scap: mw-on-k8s: Raise the number of canary replicas - T357402
  • 15:28 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db1149.eqiad.wmnet
  • 15:28 arnaudb@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:28 arnaudb@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1149.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - arnaudb@cumin1002"
  • 15:26 brett@cumin2002: START - Cookbook sre.hosts.reimage for host ncmonitor1001.eqiad.wmnet with OS bookworm
  • 15:25 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P56702 and previous config saved to /var/cache/conftool/dbconfig/20240213-152554-ladsgroup.json
  • 15:21 hnowlan@cumin1002: conftool action : set/weight=10:pooled=yes; selector: name=(mw1431.eqiad.wmnet|mw1430.eqiad.wmnet|mw1434.eqiad.wmnet|mw1453.eqiad.wmnet|mw1385.eqiad.wmnet),cluster=kubernetes,service=kubesvc
  • 15:20 stevemunene@deploy2002: helmfile [eqiad] DONE helmfile.d/services/editor-analytics: apply
  • 15:20 arnaudb@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1149.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - arnaudb@cumin1002"
  • 15:20 stevemunene@deploy2002: helmfile [eqiad] START helmfile.d/services/editor-analytics: apply
  • 15:19 stevemunene@deploy2002: helmfile [codfw] DONE helmfile.d/services/editor-analytics: apply
  • 15:19 stevemunene@deploy2002: helmfile [codfw] START helmfile.d/services/editor-analytics: apply
  • 15:18 stevemunene@deploy2002: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
  • 15:17 stevemunene@deploy2002: helmfile [staging] START helmfile.d/services/editor-analytics: apply
  • 15:16 stevemunene@deploy2002: helmfile [eqiad] DONE helmfile.d/services/edit-analytics: apply
  • 15:16 arnaudb@cumin1002: START - Cookbook sre.dns.netbox
  • 15:15 stevemunene@deploy2002: helmfile [eqiad] START helmfile.d/services/edit-analytics: apply
  • 15:15 stevemunene@deploy2002: helmfile [codfw] DONE helmfile.d/services/edit-analytics: apply
  • 15:14 stevemunene@deploy2002: helmfile [codfw] START helmfile.d/services/edit-analytics: apply
  • 15:14 hnowlan: running `homer 'cr*eqiad*' commit 'T351074'`
  • 15:14 stevemunene@deploy2002: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
  • 15:13 stevemunene@deploy2002: helmfile [staging] START helmfile.d/services/edit-analytics: apply
  • 15:11 arnaudb@cumin1002: START - Cookbook sre.hosts.decommission for hosts db1149.eqiad.wmnet
  • 15:10 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P56701 and previous config saved to /var/cache/conftool/dbconfig/20240213-151047-ladsgroup.json
  • 14:55 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2109 (T352010)', diff saved to https://phabricator.wikimedia.org/P56700 and previous config saved to /var/cache/conftool/dbconfig/20240213-145541-ladsgroup.json
  • 14:48 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on cp[2027-2028].codfw.wmnet with reason: T355863
  • 14:47 brett@cumin2002: START - Cookbook sre.hosts.downtime for 3:00:00 on cp[2027-2028].codfw.wmnet with reason: T355863
  • 14:46 brett@puppetmaster1001: conftool action : set/pooled=no; selector: dc=codfw,name=cp202(7|8).codfw.wmnet
  • 14:36 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts sessionstore[2001-2003].codfw.wmnet
  • 14:36 eevans@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:36 eevans@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: sessionstore[2001-2003].codfw.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1002"
  • 14:35 eevans@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: sessionstore[2001-2003].codfw.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1002"
  • 14:33 moritzm: imported openssl 1.1.1w-0+deb11u1+wmf1 to component/haproxy26 T352744
  • 14:30 eevans@cumin1002: START - Cookbook sre.dns.netbox
  • 14:26 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Banning hosts: elastic2061*,elastic2062*,elastic2089* for switch maintenance - bking@cumin2002 - T355863
  • 14:26 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Banning hosts: elastic2061*,elastic2062*,elastic2089* for switch maintenance - bking@cumin2002 - T355863
  • 14:22 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2109 (T352010)', diff saved to https://phabricator.wikimedia.org/P56699 and previous config saved to /var/cache/conftool/dbconfig/20240213-142250-ladsgroup.json
  • 14:22 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2109.codfw.wmnet with reason: Maintenance
  • 14:22 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2109.codfw.wmnet with reason: Maintenance
  • 14:22 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2105 (T352010)', diff saved to https://phabricator.wikimedia.org/P56698 and previous config saved to /var/cache/conftool/dbconfig/20240213-142228-ladsgroup.json
  • 14:20 eevans@cumin1002: START - Cookbook sre.hosts.decommission for hosts sessionstore[2001-2003].codfw.wmnet
  • 14:18 godog: bounce puppetserver on puppetserver1003 to test noop config change - T352640
  • 14:11 jelto: import etherpad-lite 1.9.7-2 on apt host into bookworm-wikimedia - T316421
  • 14:08 effie: restarting envoy on baremetal mediawiki api servers
  • 14:07 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2105', diff saved to https://phabricator.wikimedia.org/P56697 and previous config saved to /var/cache/conftool/dbconfig/20240213-140722-ladsgroup.json
  • 13:52 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2105', diff saved to https://phabricator.wikimedia.org/P56696 and previous config saved to /var/cache/conftool/dbconfig/20240213-135215-ladsgroup.json
  • 13:48 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1005.eqiad.wmnet
  • 13:45 hashar@deploy2002: Finished deploy [gerrit/gerrit@737c475]: wm-checks-api: Gerrit 3.8 no more sets redundant real_author (duration: 00m 07s)
  • 13:45 hashar@deploy2002: Started deploy [gerrit/gerrit@737c475]: wm-checks-api: Gerrit 3.8 no more sets redundant real_author
  • 13:42 cmooney@cumin1002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2027.codfw.wmnet
  • 13:40 cmooney@cumin1002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2027.codfw.wmnet
  • 13:40 taavi@cumin1002: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1005.eqiad.wmnet
  • 13:37 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2105 (T352010)', diff saved to https://phabricator.wikimedia.org/P56695 and previous config saved to /var/cache/conftool/dbconfig/20240213-133709-ladsgroup.json
  • 13:33 hashar@deploy2002: Finished deploy [gerrit/gerrit@7dd9a27]: Support Gerrit 3.8 CSS styling API - T354886 (duration: 00m 07s)
  • 13:33 hashar@deploy2002: Started deploy [gerrit/gerrit@7dd9a27]: Support Gerrit 3.8 CSS styling API - T354886
  • 13:31 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1430.eqiad.wmnet with OS bullseye
  • 13:28 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1453.eqiad.wmnet with OS bullseye
  • 13:28 hashar@deploy2002: Finished deploy [gerrit/gerrit@b02c97e]: Let Gerrit manage light/dark theme (duration: 00m 07s)
  • 13:28 hashar@deploy2002: Started deploy [gerrit/gerrit@b02c97e]: Let Gerrit manage light/dark theme
  • 13:26 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1434.eqiad.wmnet with OS bullseye
  • 13:24 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1385.eqiad.wmnet with OS bullseye
  • 13:22 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1007.eqiad.wmnet
  • 13:21 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1431.eqiad.wmnet with OS bullseye
  • 13:13 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1430.eqiad.wmnet with reason: host reimage
  • 13:12 taavi@cumin1002: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1007.eqiad.wmnet
  • 13:11 kharlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
  • 13:11 kharlan@deploy2002: helmfile [eqiad] START helmfile.d/services/ipoid: apply
  • 13:11 kharlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
  • 13:11 kharlan@deploy2002: helmfile [codfw] START helmfile.d/services/ipoid: apply
  • 13:10 kharlan@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
  • 13:10 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1453.eqiad.wmnet with reason: host reimage
  • 13:10 kharlan@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
  • 13:08 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1434.eqiad.wmnet with reason: host reimage
  • 13:06 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1385.eqiad.wmnet with reason: host reimage
  • 13:04 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1006.eqiad.wmnet
  • 13:03 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1431.eqiad.wmnet with reason: host reimage
  • 13:03 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2105 (T352010)', diff saved to https://phabricator.wikimedia.org/P56694 and previous config saved to /var/cache/conftool/dbconfig/20240213-130316-ladsgroup.json
  • 13:03 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2105.codfw.wmnet with reason: Maintenance
  • 13:03 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2105.codfw.wmnet with reason: Maintenance
  • 13:02 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T352010)', diff saved to https://phabricator.wikimedia.org/P56693 and previous config saved to /var/cache/conftool/dbconfig/20240213-130255-ladsgroup.json
  • 13:02 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1385.eqiad.wmnet with reason: host reimage
  • 13:01 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1453.eqiad.wmnet with reason: host reimage
  • 13:01 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1430.eqiad.wmnet with reason: host reimage
  • 13:01 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1434.eqiad.wmnet with reason: host reimage
  • 13:00 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1431.eqiad.wmnet with reason: host reimage
  • 12:57 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on restbase1035.eqiad.wmnet with reason: host reimage
  • 12:56 taavi@cumin1002: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1006.eqiad.wmnet
  • 12:56 zabe: zabe@mwmaint2002:/tmp/uploads$ mwscript importImages.php --wiki=commonswiki --comment-ext=txt --user="Jeff G." . # T357403
  • 12:54 effie: restarting envoy on baremetal mediawiki appservers
  • 12:54 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on restbase1035.eqiad.wmnet with reason: host reimage
  • 12:48 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1385.eqiad.wmnet with OS bullseye
  • 12:48 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1453.eqiad.wmnet with OS bullseye
  • 12:48 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1434.eqiad.wmnet with OS bullseye
  • 12:48 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1430.eqiad.wmnet with OS bullseye
  • 12:47 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P56692 and previous config saved to /var/cache/conftool/dbconfig/20240213-124748-ladsgroup.json
  • 12:47 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1431.eqiad.wmnet with OS bullseye
  • 12:39 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host restbase1035.eqiad.wmnet with OS bullseye
  • 12:32 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P56691 and previous config saved to /var/cache/conftool/dbconfig/20240213-123242-ladsgroup.json
  • 12:17 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T352010)', diff saved to https://phabricator.wikimedia.org/P56690 and previous config saved to /var/cache/conftool/dbconfig/20240213-121736-ladsgroup.json
  • 12:00 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1212 (T352010)', diff saved to https://phabricator.wikimedia.org/P56689 and previous config saved to /var/cache/conftool/dbconfig/20240213-120035-ladsgroup.json
  • 12:00 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 12:00 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 12:00 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1212.eqiad.wmnet with reason: Maintenance
  • 11:59 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1212.eqiad.wmnet with reason: Maintenance
  • 11:50 brouberol@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host apifeatureusage2001.codfw.wmnet with OS bullseye
  • 11:45 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1133.eqiad.wmnet with reason: T350458
  • 11:45 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1133.eqiad.wmnet with reason: T350458
  • 11:37 cgoubert@deploy2002: Finished scap: Change default maxUnavailable for mw-on-k8s to 10% (duration: 03m 17s)
  • 11:34 cgoubert@deploy2002: Started scap: Change default maxUnavailable for mw-on-k8s to 10%
  • 11:34 brouberol@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on apifeatureusage2001.codfw.wmnet with reason: host reimage
  • 11:33 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 11:32 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 11:32 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
  • 11:31 brouberol@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on apifeatureusage2001.codfw.wmnet with reason: host reimage
  • 11:31 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
  • 11:27 hnowlan@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host mw2282.codfw.wmnet with OS bullseye
  • 11:24 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
  • 11:24 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 11:24 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
  • 11:24 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 11:24 claime: Change default maxUnavailable for mw-on-k8s to 10%
  • 11:21 brouberol@cumin1002: START - Cookbook sre.hosts.reimage for host apifeatureusage2001.codfw.wmnet with OS bullseye
  • 11:20 brouberol@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host apifeatureusage1001.eqiad.wmnet with OS bullseye
  • 11:14 gmodena@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
  • 11:14 gmodena@deploy2002: helmfile [codfw] START helmfile.d/services/eventstreams: apply
  • 11:14 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2282.codfw.wmnet with OS bullseye
  • 11:13 hnowlan@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host mw2282.codfw.wmnet with OS bullseye
  • 11:12 gmodena@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
  • 11:11 gmodena@deploy2002: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
  • 11:10 gmodena@deploy2002: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
  • 11:10 gmodena@deploy2002: helmfile [staging] START helmfile.d/services/eventstreams: apply
  • 11:04 brouberol@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on apifeatureusage1001.eqiad.wmnet with reason: host reimage
  • 11:01 brouberol@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on apifeatureusage1001.eqiad.wmnet with reason: host reimage
  • 11:01 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for mw2388.codfw.wmnet
  • 11:01 cgoubert@cumin2002: START - Cookbook sre.hosts.remove-downtime for mw2388.codfw.wmnet
  • 10:57 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2282.codfw.wmnet with OS bullseye
  • 10:49 brouberol@cumin1002: START - Cookbook sre.hosts.reimage for host apifeatureusage1001.eqiad.wmnet with OS bullseye
  • 10:41 brouberol@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host apifeatureusage1001.eqiad.wmnet with OS bookworm
  • 10:39 brouberol@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on apifeatureusage1001.eqiad.wmnet with reason: host reimage
  • 10:36 brouberol@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on apifeatureusage1001.eqiad.wmnet with reason: host reimage
  • 10:25 brouberol@cumin1002: START - Cookbook sre.hosts.reimage for host apifeatureusage1001.eqiad.wmnet with OS bookworm
  • 10:23 brouberol@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host apifeatureusage1001.eqiad.wmnet with OS bookworm
  • 10:23 kharlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
  • 10:23 kharlan@deploy2002: helmfile [codfw] START helmfile.d/services/ipoid: apply
  • 10:22 kharlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
  • 10:22 kharlan@deploy2002: helmfile [eqiad] START helmfile.d/services/ipoid: apply
  • 10:22 kharlan@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
  • 10:22 kharlan@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
  • 10:09 brouberol@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on apifeatureusage1001.eqiad.wmnet with reason: host reimage
  • 10:06 brouberol@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on apifeatureusage1001.eqiad.wmnet with reason: host reimage
  • 10:05 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host clouddb2002-dev.codfw.wmnet
  • 09:58 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host clouddb2002-dev.codfw.wmnet
  • 09:57 brouberol@cumin1002: START - Cookbook sre.hosts.reimage for host apifeatureusage1001.eqiad.wmnet with OS bookworm
  • 09:23 stran@deploy2002: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
  • 09:22 stran@deploy2002: helmfile [codfw] START helmfile.d/services/ipoid: apply
  • 09:22 stran@deploy2002: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
  • 09:22 akosiaris: delete sessionstore pod to force rescheduling
  • 09:21 stran@deploy2002: helmfile [eqiad] START helmfile.d/services/ipoid: apply
  • 09:20 stran@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
  • 09:20 brouberol@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host apifeatureusage1001.eqiad.wmnet with OS bookworm
  • 09:20 stran@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
  • 09:18 brouberol@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on apifeatureusage1001.eqiad.wmnet with reason: host reimage
  • 09:16 brouberol@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on apifeatureusage1001.eqiad.wmnet with reason: host reimage
  • 09:04 brouberol@cumin1002: START - Cookbook sre.hosts.reimage for host apifeatureusage1001.eqiad.wmnet with OS bookworm
  • 08:28 hashar@deploy2002: Finished scap: Backport for Increase $wgMaxUploadSize to 5 GiB (previously was 4GiB). (T191804) (duration: 08m 57s)
  • 08:27 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: grafana
  • 08:21 hashar@deploy2002: hashar and bawolff: Continuing with sync
  • 08:21 hashar@deploy2002: hashar and bawolff: Backport for Increase $wgMaxUploadSize to 5 GiB (previously was 4GiB). (T191804) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:19 hashar@deploy2002: Started scap: Backport for Increase $wgMaxUploadSize to 5 GiB (previously was 4GiB). (T191804)
  • 08:18 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: grafana
  • 07:01 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1149.eqiad.wmnet with reason: Maintenance
  • 07:00 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1149.eqiad.wmnet with reason: Maintenance
  • 04:57 mwpresync@deploy2002: Finished scap: testwikis wikis to 1.42.0-wmf.18 refs T354436 (duration: 52m 36s)
  • 04:04 mwpresync@deploy2002: Started scap: testwikis wikis to 1.42.0-wmf.18 refs T354436
  • 04:02 mwpresync@deploy2002: Pruned MediaWiki: 1.42.0-wmf.15 (duration: 02m 09s)
  • 02:38 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1146.eqiad.wmnet with reason: Maintenance
  • 02:38 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1146.eqiad.wmnet with reason: Maintenance
  • 01:12 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 01:02 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 00:27 eileen: civicrm upgraded from 684286b4 to ac69725f

2024-02-12

  • 23:51 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ncmonitor1001.eqiad.wmnet with OS bookworm
  • 23:25 zabe: zabe@mwmaint2002:/tmp/uploads$ mwscript importImages.php --wiki=commonswiki --comment-ext=txt --user="Yann" . # T357208
  • 23:16 Daimona: T357007 Running mwscript CampaignEvents:GenerateInvitationList --wiki=metawiki --listfile=/home/daimona/list2.txt
  • 23:15 brett@cumin2002: START - Cookbook sre.hosts.reimage for host ncmonitor1001.eqiad.wmnet with OS bookworm
  • 23:03 dzahn@cumin1002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host etherpad2002.codfw.wmnet
  • 23:03 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host etherpad2002.codfw.wmnet with OS bookworm
  • 22:49 ebernhardson@deploy2002: Finished scap: Backport for Connection: Correct read-only detection (T354793 T356526) (duration: 08m 35s)
  • 22:42 ebernhardson@deploy2002: ebernhardson: Continuing with sync
  • 22:42 ebernhardson@deploy2002: ebernhardson: Backport for Connection: Correct read-only detection (T354793 T356526) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 22:40 ebernhardson@deploy2002: Started scap: Backport for Connection: Correct read-only detection (T354793 T356526)
  • 22:39 maryum: deployed patch for T357101
  • 22:30 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on sessionstore2003.codfw.wmnet with reason: Decommissioning — T356828
  • 22:30 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on sessionstore2003.codfw.wmnet with reason: Decommissioning — T356828
  • 22:30 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on sessionstore2002.codfw.wmnet with reason: Decommissioning — T356828
  • 22:30 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on sessionstore2002.codfw.wmnet with reason: Decommissioning — T356828
  • 22:29 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on sessionstore2001.codfw.wmnet with reason: Decommissioning — T356828
  • 22:29 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on sessionstore2001.codfw.wmnet with reason: Decommissioning — T356828
  • 22:26 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 22:26 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 22:20 brett@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host ncmonitor1001.eqiad.wmnet
  • 22:20 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ncmonitor1001.eqiad.wmnet with OS bookworm
  • 22:16 cjming@deploy2002: Finished scap: Backport for MobileFrontend: Set fallback editor to 'visual' on labs (duration: 07m 53s)
  • 22:15 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:15 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:11 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:11 ebernhardson@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:11 ebernhardson@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:10 cjming@deploy2002: esanders and cjming: Continuing with sync
  • 22:10 cjming@deploy2002: esanders and cjming: Backport for MobileFrontend: Set fallback editor to 'visual' on labs synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 22:08 cjming@deploy2002: Started scap: Backport for MobileFrontend: Set fallback editor to 'visual' on labs
  • 22:07 cjming@deploy2002: Finished scap: Backport for Make thanks button show again (T357202), Diffs: Localize number in timeago (T357079) (duration: 09m 17s)
  • 22:05 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:04 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:00 cjming@deploy2002: cjming and jdlrobson: Continuing with sync
  • 21:59 cjming@deploy2002: cjming and jdlrobson: Backport for Make thanks button show again (T357202), Diffs: Localize number in timeago (T357079) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:58 cjming@deploy2002: Started scap: Backport for Make thanks button show again (T357202), Diffs: Localize number in timeago (T357079)
  • 21:38 cjming@deploy2002: Finished scap: Backport for Use @wikimedia/mediawiki.skins.clientpreferences@1.1.1 (T357212) (duration: 12m 58s)
  • 21:31 cjming@deploy2002: cjming and jdlrobson: Continuing with sync
  • 21:26 cjming@deploy2002: cjming and jdlrobson: Backport for Use @wikimedia/mediawiki.skins.clientpreferences@1.1.1 (T357212) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:25 cjming@deploy2002: Started scap: Backport for Use @wikimedia/mediawiki.skins.clientpreferences@1.1.1 (T357212)
  • 20:30 eevans@deploy2002: helmfile [codfw] DONE helmfile.d/services/sessionstore: apply
  • 20:28 eevans@deploy2002: helmfile [codfw] START helmfile.d/services/sessionstore: apply
  • 20:27 eevans@deploy2002: helmfile [eqiad] DONE helmfile.d/services/sessionstore: apply
  • 20:27 eevans@deploy2002: helmfile [eqiad] START helmfile.d/services/sessionstore: apply
  • 20:26 eevans@deploy2002: helmfile [staging] DONE helmfile.d/services/sessionstore: apply
  • 20:26 eevans@deploy2002: helmfile [staging] START helmfile.d/services/sessionstore: apply
  • 19:48 ejegg: fundraising python tools upgraded from c823e692 to 2d164db5
  • 19:28 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on etherpad2002.codfw.wmnet with reason: host reimage
  • 19:25 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on etherpad2002.codfw.wmnet with reason: host reimage
  • 19:08 dzahn@cumin1002: START - Cookbook sre.hosts.reimage for host etherpad2002.codfw.wmnet with OS bookworm
  • 19:07 dzahn@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM etherpad2002.codfw.wmnet - dzahn@cumin1002"
  • 19:06 dzahn@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM etherpad2002.codfw.wmnet - dzahn@cumin1002"
  • 19:06 dzahn@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) etherpad2002.codfw.wmnet on all recursors
  • 19:05 dzahn@cumin1002: START - Cookbook sre.dns.wipe-cache etherpad2002.codfw.wmnet on all recursors
  • 19:05 dzahn@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:05 dzahn@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM etherpad2002.codfw.wmnet - dzahn@cumin1002"
  • 19:04 dzahn@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM etherpad2002.codfw.wmnet - dzahn@cumin1002"
  • 19:02 dzahn@cumin1002: START - Cookbook sre.dns.netbox
  • 19:02 dzahn@cumin1002: START - Cookbook sre.ganeti.makevm for new host etherpad2002.codfw.wmnet
  • 18:58 dzahn@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "sync - dzahn@cumin1002"
  • 18:57 dzahn@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "sync - dzahn@cumin1002"
  • 18:56 dzahn@cumin1002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host etherpad2002.codfw.wmnet
  • 18:56 dzahn@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) etherpad2002.codfw.wmnet on all recursors
  • 18:56 dzahn@cumin1002: START - Cookbook sre.dns.wipe-cache etherpad2002.codfw.wmnet on all recursors
  • 18:56 dzahn@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:56 dzahn@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM etherpad2002.codfw.wmnet - dzahn@cumin1002"
  • 18:55 dzahn@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM etherpad2002.codfw.wmnet - dzahn@cumin1002"
  • 18:55 mutante: attempt to create a completely new VM with a new name ALSO FAILS and removes DNS entries
  • 18:53 dzahn@cumin1002: START - Cookbook sre.dns.netbox
  • 18:53 dzahn@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) etherpad2002.codfw.wmnet on all recursors
  • 18:53 dzahn@cumin1002: START - Cookbook sre.dns.wipe-cache etherpad2002.codfw.wmnet on all recursors
  • 18:53 dzahn@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:53 dzahn@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM etherpad2002.codfw.wmnet - dzahn@cumin1002"
  • 18:52 dzahn@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM etherpad2002.codfw.wmnet - dzahn@cumin1002"
  • 18:48 dzahn@cumin1002: START - Cookbook sre.dns.netbox
  • 18:48 dzahn@cumin1002: START - Cookbook sre.ganeti.makevm for new host etherpad2002.codfw.wmnet
  • 18:42 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 13:00:00 on db1133.eqiad.wmnet with reason: hush
  • 18:42 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 13:00:00 on db1133.eqiad.wmnet with reason: hush
  • 18:37 dzahn@cumin1002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host etherpad2001.codfw.wmnet
  • 18:37 dzahn@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) etherpad2001.codfw.wmnet on all recursors
  • 18:37 dzahn@cumin1002: START - Cookbook sre.dns.wipe-cache etherpad2001.codfw.wmnet on all recursors
  • 18:37 dzahn@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:37 dzahn@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM etherpad2001.codfw.wmnet - dzahn@cumin1002"
  • 18:36 dzahn@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM etherpad2001.codfw.wmnet - dzahn@cumin1002"
  • 18:35 mutante: attempting decom cookbook on "unverified" host etherpad2001, followed by makevm cookbook to create it again, to get out of the cycle of adding and removing DNS records; this fails with "is already in the cluster" even after the decom finished (T357159)
  • 18:34 dzahn@cumin1002: START - Cookbook sre.dns.netbox
  • 18:34 dzahn@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) etherpad2001.codfw.wmnet on all recursors
  • 18:34 dzahn@cumin1002: START - Cookbook sre.dns.wipe-cache etherpad2001.codfw.wmnet on all recursors
  • 18:34 dzahn@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:33 dzahn@cumin1002: START - Cookbook sre.dns.netbox
  • 18:33 dzahn@cumin1002: START - Cookbook sre.ganeti.makevm for new host etherpad2001.codfw.wmnet
  • 18:29 mutante: makevm cookbook creates and then removes DNS records, sync-netbox-hiera cookbook fails with raise NetboxError(f"Server {self._server.name} does not have any primary IP with a DNS name set.")
  • 18:29 dzahn@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "sync - dzahn@cumin1002"
  • 18:28 dzahn@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "sync - dzahn@cumin1002"
  • 18:25 dzahn@cumin1002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host etherpad2001.codfw.wmnet
  • 18:25 dzahn@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) etherpad2001.codfw.wmnet on all recursors
  • 18:25 dzahn@cumin1002: START - Cookbook sre.dns.wipe-cache etherpad2001.codfw.wmnet on all recursors
  • 18:25 dzahn@cumin1002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 18:24 dzahn@cumin1002: START - Cookbook sre.dns.netbox
  • 18:23 dzahn@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) etherpad2001.codfw.wmnet on all recursors
  • 18:23 dzahn@cumin1002: START - Cookbook sre.dns.wipe-cache etherpad2001.codfw.wmnet on all recursors
  • 18:23 dzahn@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:23 dzahn@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM etherpad2001.codfw.wmnet - dzahn@cumin1002"
  • 18:22 dzahn@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM etherpad2001.codfw.wmnet - dzahn@cumin1002"
  • 18:20 dzahn@cumin1002: START - Cookbook sre.dns.netbox
  • 18:20 dzahn@cumin1002: START - Cookbook sre.ganeti.makevm for new host etherpad2001.codfw.wmnet
  • 18:19 dzahn@cumin1002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host etherpad2001.codfw.wmnet
  • 18:19 dzahn@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) etherpad2001.codfw.wmnet on all recursors
  • 18:19 dzahn@cumin1002: START - Cookbook sre.dns.wipe-cache etherpad2001.codfw.wmnet on all recursors
  • 18:18 dzahn@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:18 dzahn@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM etherpad2001.codfw.wmnet - dzahn@cumin1002"
  • 18:18 mutante: makevm cookbook in a cycle of adding and then removing DNS records
  • 18:17 dzahn@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM etherpad2001.codfw.wmnet - dzahn@cumin1002"
  • 18:16 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on mw2388.codfw.wmnet with reason: Envoy config changed for ipoid
  • 18:16 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on mw2388.codfw.wmnet with reason: Envoy config changed for ipoid
  • 18:16 dzahn@cumin1002: START - Cookbook sre.dns.netbox
  • 18:15 dzahn@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) etherpad2001.codfw.wmnet on all recursors
  • 18:15 dzahn@cumin1002: START - Cookbook sre.dns.wipe-cache etherpad2001.codfw.wmnet on all recursors
  • 18:15 dzahn@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:15 dzahn@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM etherpad2001.codfw.wmnet - dzahn@cumin1002"
  • 18:14 dzahn@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM etherpad2001.codfw.wmnet - dzahn@cumin1002"
  • 18:12 dzahn@cumin1002: START - Cookbook sre.dns.netbox
  • 18:12 dzahn@cumin1002: START - Cookbook sre.ganeti.makevm for new host etherpad2001.codfw.wmnet
  • 18:11 mutante: spicerack.netbox.NetboxError: Server etherpad2001 does not have any primary IP with a DNS name set.
  • 18:08 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1144.eqiad.wmnet with reason: Maintenance
  • 18:08 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1144.eqiad.wmnet with reason: Maintenance
  • 18:07 dzahn@cumin1002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host etherpad2001.codfw.wmnet
  • 18:07 dzahn@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) etherpad2001.codfw.wmnet on all recursors
  • 18:07 ladsgroup@cumin1002: END (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 1 day, 0:00:00 on db1144.eqiad.wmnet with reason: Maintenance
  • 18:07 dzahn@cumin1002: START - Cookbook sre.dns.wipe-cache etherpad2001.codfw.wmnet on all recursors
  • 18:06 dzahn@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:06 dzahn@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM etherpad2001.codfw.wmnet - dzahn@cumin1002"
  • 18:06 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1144.eqiad.wmnet with reason: Maintenance
  • 18:05 dzahn@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM etherpad2001.codfw.wmnet - dzahn@cumin1002"
  • 18:03 dzahn@cumin1002: START - Cookbook sre.dns.netbox
  • 18:03 dzahn@cumin1002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM etherpad2001.codfw.wmnet - dzahn@cumin1002"
  • 17:59 dzahn@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM etherpad2001.codfw.wmnet - dzahn@cumin1002"
  • 17:58 dzahn@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) etherpad2001.codfw.wmnet on all recursors
  • 17:58 dzahn@cumin1002: START - Cookbook sre.dns.wipe-cache etherpad2001.codfw.wmnet on all recursors
  • 17:58 dzahn@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:57 dzahn@cumin1002: START - Cookbook sre.dns.netbox
  • 17:55 dzahn@cumin1002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 17:55 dzahn@cumin1002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM etherpad2001.codfw.wmnet - dzahn@cumin1002"
  • 17:46 dzahn@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM etherpad2001.codfw.wmnet - dzahn@cumin1002"
  • 17:42 hnowlan@deploy2002: helmfile [codfw] [main] DONE helmfile.d/services/mw-jobrunner : sync
  • 17:42 hnowlan@deploy2002: helmfile [codfw] [main] START helmfile.d/services/mw-jobrunner : sync
  • 17:41 hnowlan@deploy2002: helmfile [eqiad] [main] DONE helmfile.d/services/mw-jobrunner : sync
  • 17:41 hnowlan@deploy2002: helmfile [eqiad] [main] START helmfile.d/services/mw-jobrunner : sync
  • 17:37 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
  • 17:35 dzahn@cumin1002: START - Cookbook sre.dns.netbox
  • 17:35 dzahn@cumin1002: START - Cookbook sre.ganeti.makevm for new host etherpad2001.codfw.wmnet
  • 17:17 dzahn@deploy1002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 17:15 dzahn@deploy1002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 17:15 dzahn@deploy1002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 17:13 dzahn@deploy1002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 17:07 dzahn@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 17:04 arnaudb@cumin1002: dbctl commit (dc=all): 'db2180 (re)pooling @ 100%: testing db2194 done', diff saved to https://phabricator.wikimedia.org/P56686 and previous config saved to /var/cache/conftool/dbconfig/20240212-170423-arnaudb.json
  • 16:56 dzahn@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 16:54 jdrewniak@deploy2002: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 07m 00s)
  • 16:49 arnaudb@cumin1002: dbctl commit (dc=all): 'db2180 (re)pooling @ 75%: testing db2194 done', diff saved to https://phabricator.wikimedia.org/P56685 and previous config saved to /var/cache/conftool/dbconfig/20240212-164918-arnaudb.json
  • 16:48 jgiannelos@deploy2002: Finished deploy [restbase/deploy@228b93d]: (no justification provided) (duration: 16m 16s)
  • 16:47 jdrewniak@deploy2002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 07m 07s)
  • 16:34 arnaudb@cumin1002: dbctl commit (dc=all): 'db2180 (re)pooling @ 50%: testing db2194 done', diff saved to https://phabricator.wikimedia.org/P56684 and previous config saved to /var/cache/conftool/dbconfig/20240212-163413-arnaudb.json
  • 16:34 arnaudb@cumin1002: dbctl commit (dc=all): 'Removing instances as per T350458', diff saved to https://phabricator.wikimedia.org/P56683 and previous config saved to /var/cache/conftool/dbconfig/20240212-163407-arnaudb.json
  • 16:32 jgiannelos@deploy2002: Started deploy [restbase/deploy@228b93d]: (no justification provided)
  • 16:22 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for Disable JSON Dump tests to prepare for schema change in Wikibase (T305660), Return stdClass/Object from Serializers for empty lists (T305660), Change expected serialization format of JSON dumps to include arrays (T305660) (duration: 09m 42s)
  • 16:16 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde: Continuing with sync
  • 16:14 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde: Backport for Disable JSON Dump tests to prepare for schema change in Wikibase (T305660), Return stdClass/Object from Serializers for empty lists (T305660), Change expected serialization format of JSON dumps to include arrays (T305660) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 16:13 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for Disable JSON Dump tests to prepare for schema change in Wikibase (T305660), Return stdClass/Object from Serializers for empty lists (T305660), Change expected serialization format of JSON dumps to include arrays (T305660)
  • 15:53 arnaudb@cumin1002: dbctl commit (dc=all): 'db2180 (re)pooling @ 40%: testing db2194 done', diff saved to https://phabricator.wikimedia.org/P56682 and previous config saved to /var/cache/conftool/dbconfig/20240212-155325-arnaudb.json
  • 15:51 eevans@deploy2002: helmfile [codfw] DONE helmfile.d/services/sessionstore: apply
  • 15:50 eevans@deploy2002: helmfile [codfw] START helmfile.d/services/sessionstore: apply
  • 15:48 eevans@deploy2002: helmfile [eqiad] DONE helmfile.d/services/sessionstore: apply
  • 15:48 eevans@deploy2002: helmfile [eqiad] START helmfile.d/services/sessionstore: apply
  • 15:46 eevans@deploy2002: helmfile [staging] DONE helmfile.d/services/sessionstore: apply
  • 15:46 eevans@deploy2002: helmfile [staging] START helmfile.d/services/sessionstore: apply
  • 15:42 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
  • 15:38 arnaudb@cumin1002: dbctl commit (dc=all): 'db2180 (re)pooling @ 30%: testing db2194 done', diff saved to https://phabricator.wikimedia.org/P56681 and previous config saved to /var/cache/conftool/dbconfig/20240212-153820-arnaudb.json
  • 15:36 denisse: Failover Back to grafana1002 - T352665
  • 15:36 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2194.codfw.wmnet with OS bookworm
  • 15:34 denisse@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host grafana1002.eqiad.wmnet with OS bookworm
  • 15:23 arnaudb@cumin1002: dbctl commit (dc=all): 'db2180 (re)pooling @ 20%: testing db2194 done', diff saved to https://phabricator.wikimedia.org/P56680 and previous config saved to /var/cache/conftool/dbconfig/20240212-152315-arnaudb.json
  • 15:19 denisse@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on grafana1002.eqiad.wmnet with reason: host reimage
  • 15:16 denisse@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on grafana1002.eqiad.wmnet with reason: host reimage
  • 15:14 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2194.codfw.wmnet with reason: host reimage
  • 15:11 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2194.codfw.wmnet with reason: host reimage
  • 15:08 arnaudb@cumin1002: dbctl commit (dc=all): 'db2180 (re)pooling @ 10%: testing db2194 done', diff saved to https://phabricator.wikimedia.org/P56679 and previous config saved to /var/cache/conftool/dbconfig/20240212-150810-arnaudb.json
  • 15:08 denisse@cumin2002: START - Cookbook sre.hosts.reimage for host grafana1002.eqiad.wmnet with OS bookworm
  • 15:07 denisse: Reimage Standby Host (grafana1002) - T352665
  • 15:06 ejegg: re-enabled thank you mailer and donations queue consumer
  • 15:03 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudservices2005-dev.codfw.wmnet
  • 14:56 taavi@cumin1002: START - Cookbook sre.hosts.reboot-single for host cloudservices2005-dev.codfw.wmnet
  • 14:56 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudservices2004-dev.codfw.wmnet
  • 14:51 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db2194.codfw.wmnet with OS bookworm
  • 14:49 taavi@cumin1002: START - Cookbook sre.hosts.reboot-single for host cloudservices2004-dev.codfw.wmnet
  • 14:47 denisse: Completed failover to grafana2001 - T352665
  • 14:47 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) Will create a clone of db2180.codfw.wmnet onto db2194.codfw.wmnet
  • 14:42 ejegg: fundraising civicrm upgraded from c66b04bd to 684286b4
  • 14:41 ejegg: disabled thank you mailer and donations queue consumer
  • 14:36 denisse: starting Upgrade Grafana hosts to Bookworm - T352665
  • 14:33 Lucas_WMDE: UTC afternoon backport+config window done
  • 14:32 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for uzwiki: remove temporary logo files (T353723) (duration: 09m 53s)
  • 14:27 moritzm: installing Linux 6.1.76 on Bookworm hosts
  • 14:26 logmsgbot: lucaswerkmeister-wmde@deploy2002 anzx and lucaswerkmeister-wmde: Continuing with sync
  • 14:24 logmsgbot: lucaswerkmeister-wmde@deploy2002 anzx and lucaswerkmeister-wmde: Backport for uzwiki: remove temporary logo files (T353723) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:23 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for uzwiki: remove temporary logo files (T353723)
  • 14:20 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for Set $wgMinervaEnableSiteNotice for arwikisource (T356460) (duration: 09m 05s)
  • 14:13 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and hubaishan: Continuing with sync
  • 14:12 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and hubaishan: Backport for Set $wgMinervaEnableSiteNotice for arwikisource (T356460) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:10 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for Set $wgMinervaEnableSiteNotice for arwikisource (T356460)
  • 14:02 ladsgroup@deploy2002: Finished scap: Backport for Stop writing to old pagelinks schema in s4 (T352010) (duration: 23m 12s)
  • 13:57 taavi@cumin1002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cloudcontrol2005-dev.codfw.wmnet
  • 13:55 ladsgroup@deploy2002: ladsgroup: Continuing with sync
  • 13:49 ladsgroup@deploy2002: ladsgroup: Backport for Stop writing to old pagelinks schema in s4 (T352010) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:44 taavi@cumin1002: START - Cookbook sre.hosts.reboot-single for host cloudcontrol2005-dev.codfw.wmnet
  • 13:44 taavi@cumin1002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cloudcontrol2004-dev.codfw.wmnet
  • 13:39 ladsgroup@deploy2002: Started scap: Backport for Stop writing to old pagelinks schema in s4 (T352010)
  • 13:35 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2180.codfw.wmnet onto db2194.codfw.wmnet
  • 13:34 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) Will create a clone of db2180.codfw.wmnet onto db2194.codfw.wmnet
  • 13:34 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2180.codfw.wmnet onto db2194.codfw.wmnet
  • 13:32 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) Will create a clone of db2180.codfw.wmnet onto db2194.codfw.wmnet
  • 13:31 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2180.codfw.wmnet onto db2194.codfw.wmnet
  • 13:27 taavi@cumin1002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cloudcontrol2001-dev.codfw.wmnet
  • 13:21 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) Will create a clone of db2180.codfw.wmnet onto db2194.codfw.wmnet
  • 13:21 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2180.codfw.wmnet onto db2194.codfw.wmnet
  • 12:55 taavi@cumin1002: START - Cookbook sre.hosts.reboot-single for host cloudcontrol2001-dev.codfw.wmnet
  • 12:52 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudbackup1002-dev.eqiad.wmnet
  • 12:48 taavi@cumin1002: START - Cookbook sre.hosts.reboot-single for host cloudbackup1002-dev.eqiad.wmnet
  • 12:47 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudbackup1001-dev.eqiad.wmnet
  • 12:43 taavi@cumin1002: START - Cookbook sre.hosts.reboot-single for host cloudbackup1001-dev.eqiad.wmnet
  • 12:33 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset-next: apply
  • 12:33 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 12:19 brouberol@cumin1002: END (PASS) - Cookbook sre.puppet.renew-cert (exit_code=0) for stat1005.eqiad.wmnet: Renew puppet certificate - brouberol@cumin1002
  • 12:17 brouberol@cumin1002: START - Cookbook sre.puppet.renew-cert for stat1005.eqiad.wmnet: Renew puppet certificate - brouberol@cumin1002
  • 12:14 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host restbase1035.mgmt.eqiad.wmnet with reboot policy FORCED
  • 12:14 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host restbase1040.eqiad.wmnet with OS bullseye
  • 12:14 isaranto@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
  • 12:14 jclark@cumin1002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 12:14 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host restbase1034.eqiad.wmnet with OS bullseye
  • 12:14 jclark@cumin1002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 12:14 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host restbase1037.eqiad.wmnet with OS bullseye
  • 12:14 jclark@cumin1002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 12:14 isaranto@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
  • 12:13 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host restbase1038.eqiad.wmnet with OS bullseye
  • 12:13 jclark@cumin1002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 12:13 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host restbase1039.eqiad.wmnet with OS bullseye
  • 12:13 jclark@cumin1002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 12:13 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host restbase1041.eqiad.wmnet with OS bullseye
  • 12:13 jclark@cumin1002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 12:13 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host restbase1042.eqiad.wmnet with OS bullseye
  • 12:13 jclark@cumin1002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 12:04 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 36236
  • 12:02 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 36236
  • 12:01 taavi: taavi@gerrit1003 ~ $ sudo systemctl restart apache2
  • 11:48 Lucas_WMDE: START lucaswerkmeister-wmde@mwmaint2002:~$ mwscript extensions/DiscussionTools/maintenance/persistRevisionThreadItems.php --wiki frwiki --current --all --touched-after=20230613000000 --start '["7544396"]' 2>&1 | tee ~/T315510-frwiki # in tmux
  • 11:46 Lucas_WMDE: START lucaswerkmeister-wmde@mwmaint2002:~$ mwscript extensions/DiscussionTools/maintenance/persistRevisionThreadItems.php --wiki viwiki --current --all --touched-after=20230613000000 2>&1 | tee ~/T315510-viwiki # in tmux
  • 10:20 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 138881
  • 10:20 arnaudb@cumin1002: dbctl commit (dc=all): 'Depool db2190 T343674', diff saved to https://phabricator.wikimedia.org/P56677 and previous config saved to /var/cache/conftool/dbconfig/20240212-102046-arnaudb.json
  • 10:20 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2180.codfw.wmnet with reason: T343674 testing cloning a single instance node to a multi-instance one
  • 10:19 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 138881
  • 10:19 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2180.codfw.wmnet with reason: T343674 testing cloning a single instance node to a multi-instance one
  • 10:08 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Remove legacy codfw vc switches from synced hiera data after netbox status change - cmooney@cumin1002 - T355544"
  • 10:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1212 (re)pooling @ 100%: Maint over', diff saved to https://phabricator.wikimedia.org/P56676 and previous config saved to /var/cache/conftool/dbconfig/20240212-100655-ladsgroup.json
  • 10:06 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Remove legacy codfw vc switches from synced hiera data after netbox status change - cmooney@cumin1002 - T355544"
  • 09:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1212 (re)pooling @ 75%: Maint over', diff saved to https://phabricator.wikimedia.org/P56675 and previous config saved to /var/cache/conftool/dbconfig/20240212-095150-ladsgroup.json
  • 09:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1212 (re)pooling @ 25%: Maint over', diff saved to https://phabricator.wikimedia.org/P56674 and previous config saved to /var/cache/conftool/dbconfig/20240212-093645-ladsgroup.json
  • 09:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetserver1002.eqiad.wmnet
  • 09:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetserver1002.eqiad.wmnet
  • 09:21 moritzm: restarting archiva to pick up Java security updates
  • 09:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1212 (re)pooling @ 10%: Maint over', diff saved to https://phabricator.wikimedia.org/P56672 and previous config saved to /var/cache/conftool/dbconfig/20240212-092140-ladsgroup.json
  • 09:16 moritzm: installing Java 8 security updates on Buster
  • 08:58 taavi@cumin1002: conftool action : set/pooled=yes; selector: name=cloudweb1003.wikimedia.org
  • 08:54 taavi@cumin1002: conftool action : set/pooled=no; selector: name=cloudweb1003.wikimedia.org
  • 08:54 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudweb1003.wikimedia.org with OS bullseye
  • 08:52 hashar@deploy2002: Finished deploy [gerrit/gerrit@db69b2b]: Bump javascript from es2018 to es2020 (duration: 00m 07s)
  • 08:52 hashar@deploy2002: Started deploy [gerrit/gerrit@db69b2b]: Bump javascript from es2018 to es2020
  • 08:26 hashar@deploy2002: Finished deploy [integration/docroot@2360fa1]: Updating eslint-config-wikimedia and mediawiki-phan-config (duration: 00m 06s)
  • 08:26 hashar@deploy2002: Started deploy [integration/docroot@2360fa1]: Updating eslint-config-wikimedia and mediawiki-phan-config
  • 08:26 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudweb1003.wikimedia.org with reason: host reimage
  • 08:23 moritzm: update netboot image for Bookworm 12.5 point release T357133
  • 08:23 taavi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudweb1003.wikimedia.org with reason: host reimage
  • 08:11 taavi@cumin1002: START - Cookbook sre.hosts.reimage for host cloudweb1003.wikimedia.org with OS bullseye
  • 08:11 XioNoX: set esams NL-IX peering as primary
  • 08:10 moritzm: update netboot image for Bullseye 11.9 point release T357144
  • 08:03 taavi@cumin1002: conftool action : set/pooled=inactive; selector: name=cloudweb1003.wikimedia.org
  • 07:44 vgutierrez: upload golang-github-u-root-u-root_0.11.0 to apt.wm.o (bookworm)

2024-02-11

  • 19:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2192 (T352010)', diff saved to https://phabricator.wikimedia.org/P56670 and previous config saved to /var/cache/conftool/dbconfig/20240211-195509-ladsgroup.json
  • 19:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P56669 and previous config saved to /var/cache/conftool/dbconfig/20240211-194002-ladsgroup.json
  • 19:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P56668 and previous config saved to /var/cache/conftool/dbconfig/20240211-192455-ladsgroup.json
  • 19:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2192 (T352010)', diff saved to https://phabricator.wikimedia.org/P56667 and previous config saved to /var/cache/conftool/dbconfig/20240211-190948-ladsgroup.json
  • 16:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2192 (T352010)', diff saved to https://phabricator.wikimedia.org/P56666 and previous config saved to /var/cache/conftool/dbconfig/20240211-165910-ladsgroup.json
  • 16:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2192.codfw.wmnet with reason: Maintenance
  • 16:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2192.codfw.wmnet with reason: Maintenance
  • 16:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T352010)', diff saved to https://phabricator.wikimedia.org/P56665 and previous config saved to /var/cache/conftool/dbconfig/20240211-165848-ladsgroup.json
  • 16:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P56664 and previous config saved to /var/cache/conftool/dbconfig/20240211-164341-ladsgroup.json
  • 16:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P56663 and previous config saved to /var/cache/conftool/dbconfig/20240211-162834-ladsgroup.json
  • 16:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T352010)', diff saved to https://phabricator.wikimedia.org/P56662 and previous config saved to /var/cache/conftool/dbconfig/20240211-161328-ladsgroup.json
  • 13:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2178 (T352010)', diff saved to https://phabricator.wikimedia.org/P56661 and previous config saved to /var/cache/conftool/dbconfig/20240211-132638-ladsgroup.json
  • 13:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2178.codfw.wmnet with reason: Maintenance
  • 13:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2178.codfw.wmnet with reason: Maintenance
  • 13:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315 (T352010)', diff saved to https://phabricator.wikimedia.org/P56660 and previous config saved to /var/cache/conftool/dbconfig/20240211-132617-ladsgroup.json
  • 13:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315', diff saved to https://phabricator.wikimedia.org/P56659 and previous config saved to /var/cache/conftool/dbconfig/20240211-131110-ladsgroup.json
  • 12:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315', diff saved to https://phabricator.wikimedia.org/P56658 and previous config saved to /var/cache/conftool/dbconfig/20240211-125603-ladsgroup.json
  • 12:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315 (T352010)', diff saved to https://phabricator.wikimedia.org/P56657 and previous config saved to /var/cache/conftool/dbconfig/20240211-124057-ladsgroup.json
  • 09:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2171:3315 (T352010)', diff saved to https://phabricator.wikimedia.org/P56656 and previous config saved to /var/cache/conftool/dbconfig/20240211-094158-ladsgroup.json
  • 09:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2171.codfw.wmnet with reason: Maintenance
  • 09:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2171.codfw.wmnet with reason: Maintenance
  • 09:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T352010)', diff saved to https://phabricator.wikimedia.org/P56655 and previous config saved to /var/cache/conftool/dbconfig/20240211-094136-ladsgroup.json
  • 09:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P56654 and previous config saved to /var/cache/conftool/dbconfig/20240211-092630-ladsgroup.json
  • 09:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P56653 and previous config saved to /var/cache/conftool/dbconfig/20240211-091123-ladsgroup.json
  • 08:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T352010)', diff saved to https://phabricator.wikimedia.org/P56652 and previous config saved to /var/cache/conftool/dbconfig/20240211-085616-ladsgroup.json
  • 05:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2157 (T352010)', diff saved to https://phabricator.wikimedia.org/P56651 and previous config saved to /var/cache/conftool/dbconfig/20240211-055427-ladsgroup.json
  • 05:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2157.codfw.wmnet with reason: Maintenance
  • 05:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2157.codfw.wmnet with reason: Maintenance
  • 05:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315 (T352010)', diff saved to https://phabricator.wikimedia.org/P56650 and previous config saved to /var/cache/conftool/dbconfig/20240211-055405-ladsgroup.json
  • 05:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315', diff saved to https://phabricator.wikimedia.org/P56649 and previous config saved to /var/cache/conftool/dbconfig/20240211-053858-ladsgroup.json
  • 05:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315', diff saved to https://phabricator.wikimedia.org/P56648 and previous config saved to /var/cache/conftool/dbconfig/20240211-052352-ladsgroup.json
  • 05:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315 (T352010)', diff saved to https://phabricator.wikimedia.org/P56647 and previous config saved to /var/cache/conftool/dbconfig/20240211-050845-ladsgroup.json
  • 01:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2137:3315 (T352010)', diff saved to https://phabricator.wikimedia.org/P56646 and previous config saved to /var/cache/conftool/dbconfig/20240211-015257-ladsgroup.json
  • 01:52 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2137.codfw.wmnet with reason: Maintenance
  • 01:52 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2137.codfw.wmnet with reason: Maintenance
  • 01:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2128 (T352010)', diff saved to https://phabricator.wikimedia.org/P56645 and previous config saved to /var/cache/conftool/dbconfig/20240211-015236-ladsgroup.json
  • 01:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2128', diff saved to https://phabricator.wikimedia.org/P56644 and previous config saved to /var/cache/conftool/dbconfig/20240211-013729-ladsgroup.json
  • 01:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2128', diff saved to https://phabricator.wikimedia.org/P56643 and previous config saved to /var/cache/conftool/dbconfig/20240211-012222-ladsgroup.json
  • 01:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2128 (T352010)', diff saved to https://phabricator.wikimedia.org/P56642 and previous config saved to /var/cache/conftool/dbconfig/20240211-010715-ladsgroup.json

2024-02-10

  • 21:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2128 (T352010)', diff saved to https://phabricator.wikimedia.org/P56641 and previous config saved to /var/cache/conftool/dbconfig/20240210-215952-ladsgroup.json
  • 21:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 21:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 21:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2128.codfw.wmnet with reason: Maintenance
  • 21:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2128.codfw.wmnet with reason: Maintenance
  • 21:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2123 (T352010)', diff saved to https://phabricator.wikimedia.org/P56640 and previous config saved to /var/cache/conftool/dbconfig/20240210-215913-ladsgroup.json
  • 21:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2123', diff saved to https://phabricator.wikimedia.org/P56639 and previous config saved to /var/cache/conftool/dbconfig/20240210-214405-ladsgroup.json
  • 21:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2123', diff saved to https://phabricator.wikimedia.org/P56638 and previous config saved to /var/cache/conftool/dbconfig/20240210-212859-ladsgroup.json
  • 21:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2123 (T352010)', diff saved to https://phabricator.wikimedia.org/P56637 and previous config saved to /var/cache/conftool/dbconfig/20240210-211352-ladsgroup.json
  • 19:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1212 (T352010)', diff saved to https://phabricator.wikimedia.org/P56636 and previous config saved to /var/cache/conftool/dbconfig/20240210-192353-ladsgroup.json
  • 19:23 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 19:23 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 19:23 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1212.eqiad.wmnet with reason: Maintenance
  • 19:23 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1212.eqiad.wmnet with reason: Maintenance
  • 19:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T352010)', diff saved to https://phabricator.wikimedia.org/P56635 and previous config saved to /var/cache/conftool/dbconfig/20240210-192312-ladsgroup.json
  • 19:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P56634 and previous config saved to /var/cache/conftool/dbconfig/20240210-190805-ladsgroup.json
  • 18:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P56633 and previous config saved to /var/cache/conftool/dbconfig/20240210-185258-ladsgroup.json
  • 18:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T352010)', diff saved to https://phabricator.wikimedia.org/P56632 and previous config saved to /var/cache/conftool/dbconfig/20240210-183752-ladsgroup.json
  • 18:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2123 (T352010)', diff saved to https://phabricator.wikimedia.org/P56631 and previous config saved to /var/cache/conftool/dbconfig/20240210-181424-ladsgroup.json
  • 18:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2123.codfw.wmnet with reason: Maintenance
  • 18:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2123.codfw.wmnet with reason: Maintenance
  • 18:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2111 (T352010)', diff saved to https://phabricator.wikimedia.org/P56630 and previous config saved to /var/cache/conftool/dbconfig/20240210-181403-ladsgroup.json
  • 17:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2111', diff saved to https://phabricator.wikimedia.org/P56629 and previous config saved to /var/cache/conftool/dbconfig/20240210-175856-ladsgroup.json
  • 17:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2111', diff saved to https://phabricator.wikimedia.org/P56628 and previous config saved to /var/cache/conftool/dbconfig/20240210-174349-ladsgroup.json
  • 17:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2111 (T352010)', diff saved to https://phabricator.wikimedia.org/P56627 and previous config saved to /var/cache/conftool/dbconfig/20240210-172843-ladsgroup.json
  • 14:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2111 (T352010)', diff saved to https://phabricator.wikimedia.org/P56626 and previous config saved to /var/cache/conftool/dbconfig/20240210-140241-ladsgroup.json
  • 14:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2111.codfw.wmnet with reason: Maintenance
  • 14:02 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2111.codfw.wmnet with reason: Maintenance
  • 11:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1198 (T352010)', diff saved to https://phabricator.wikimedia.org/P56625 and previous config saved to /var/cache/conftool/dbconfig/20240210-112150-ladsgroup.json
  • 11:21 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1198.eqiad.wmnet with reason: Maintenance
  • 11:21 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1198.eqiad.wmnet with reason: Maintenance
  • 11:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T352010)', diff saved to https://phabricator.wikimedia.org/P56624 and previous config saved to /var/cache/conftool/dbconfig/20240210-112129-ladsgroup.json
  • 11:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P56623 and previous config saved to /var/cache/conftool/dbconfig/20240210-110622-ladsgroup.json
  • 10:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P56622 and previous config saved to /var/cache/conftool/dbconfig/20240210-105116-ladsgroup.json
  • 10:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2101.codfw.wmnet with reason: Maintenance
  • 10:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2101.codfw.wmnet with reason: Maintenance
  • 10:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T352010)', diff saved to https://phabricator.wikimedia.org/P56621 and previous config saved to /var/cache/conftool/dbconfig/20240210-103609-ladsgroup.json
  • 08:21 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 08:21 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 05:47 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1245.eqiad.wmnet with reason: Maintenance
  • 05:47 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1245.eqiad.wmnet with reason: Maintenance
  • 05:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1244:3315 (T352010)', diff saved to https://phabricator.wikimedia.org/P56620 and previous config saved to /var/cache/conftool/dbconfig/20240210-054721-ladsgroup.json
  • 05:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1244:3315', diff saved to https://phabricator.wikimedia.org/P56619 and previous config saved to /var/cache/conftool/dbconfig/20240210-053215-ladsgroup.json
  • 05:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1244:3315', diff saved to https://phabricator.wikimedia.org/P56618 and previous config saved to /var/cache/conftool/dbconfig/20240210-051708-ladsgroup.json
  • 05:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1244:3315 (T352010)', diff saved to https://phabricator.wikimedia.org/P56617 and previous config saved to /var/cache/conftool/dbconfig/20240210-050201-ladsgroup.json
  • 03:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1189.eqiad.wmnet with reason: Maintenance
  • 03:28 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1189.eqiad.wmnet with reason: Maintenance
  • 03:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T352010)', diff saved to https://phabricator.wikimedia.org/P56615 and previous config saved to /var/cache/conftool/dbconfig/20240210-032801-ladsgroup.json
  • 03:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P56614 and previous config saved to /var/cache/conftool/dbconfig/20240210-031255-ladsgroup.json
  • 02:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P56613 and previous config saved to /var/cache/conftool/dbconfig/20240210-025748-ladsgroup.json
  • 02:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T352010)', diff saved to https://phabricator.wikimedia.org/P56612 and previous config saved to /var/cache/conftool/dbconfig/20240210-024242-ladsgroup.json
  • 02:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1244:3315 (T352010)', diff saved to https://phabricator.wikimedia.org/P56611 and previous config saved to /var/cache/conftool/dbconfig/20240210-021141-ladsgroup.json
  • 02:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1244.eqiad.wmnet with reason: Maintenance
  • 02:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1244.eqiad.wmnet with reason: Maintenance
  • 02:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1230 (T352010)', diff saved to https://phabricator.wikimedia.org/P56610 and previous config saved to /var/cache/conftool/dbconfig/20240210-021119-ladsgroup.json
  • 01:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1230', diff saved to https://phabricator.wikimedia.org/P56609 and previous config saved to /var/cache/conftool/dbconfig/20240210-015612-ladsgroup.json
  • 01:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1230', diff saved to https://phabricator.wikimedia.org/P56608 and previous config saved to /var/cache/conftool/dbconfig/20240210-014106-ladsgroup.json
  • 01:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1230 (T352010)', diff saved to https://phabricator.wikimedia.org/P56607 and previous config saved to /var/cache/conftool/dbconfig/20240210-012559-ladsgroup.json

2024-02-09

  • 23:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1230 (T352010)', diff saved to https://phabricator.wikimedia.org/P56606 and previous config saved to /var/cache/conftool/dbconfig/20240209-230425-ladsgroup.json
  • 23:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1230.eqiad.wmnet with reason: Maintenance
  • 23:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1230.eqiad.wmnet with reason: Maintenance
  • 21:39 inflatador: bking@deploy2002 install 'python3-boto3' pkg T348685
  • 21:36 inflatador: bking@deploy2002 install 'python3-plac' pkg T348685
  • 21:09 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: apply new systemd settings - bking@cumin2002 - T355617
  • 21:06 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: apply new systemd settings - bking@cumin2002 - T355617
  • 20:55 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster relforge: apply new systemd settings - bking@cumin2002 - T355617
  • 20:46 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster relforge: apply new systemd settings - bking@cumin2002 - T355617
  • 20:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1216.eqiad.wmnet with reason: Maintenance
  • 20:28 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1216.eqiad.wmnet with reason: Maintenance
  • 20:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1213:3315 (T352010)', diff saved to https://phabricator.wikimedia.org/P56605 and previous config saved to /var/cache/conftool/dbconfig/20240209-202830-ladsgroup.json
  • 20:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1213:3315', diff saved to https://phabricator.wikimedia.org/P56604 and previous config saved to /var/cache/conftool/dbconfig/20240209-201324-ladsgroup.json
  • 19:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1213:3315', diff saved to https://phabricator.wikimedia.org/P56603 and previous config saved to /var/cache/conftool/dbconfig/20240209-195817-ladsgroup.json
  • 19:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1213:3315 (T352010)', diff saved to https://phabricator.wikimedia.org/P56602 and previous config saved to /var/cache/conftool/dbconfig/20240209-194310-ladsgroup.json
  • 19:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1175 (T352010)', diff saved to https://phabricator.wikimedia.org/P56601 and previous config saved to /var/cache/conftool/dbconfig/20240209-193452-ladsgroup.json
  • 19:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1175.eqiad.wmnet with reason: Maintenance
  • 19:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1175.eqiad.wmnet with reason: Maintenance
  • 19:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T352010)', diff saved to https://phabricator.wikimedia.org/P56600 and previous config saved to /var/cache/conftool/dbconfig/20240209-193430-ladsgroup.json
  • 19:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P56599 and previous config saved to /var/cache/conftool/dbconfig/20240209-191923-ladsgroup.json
  • 19:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P56598 and previous config saved to /var/cache/conftool/dbconfig/20240209-190416-ladsgroup.json
  • 18:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T352010)', diff saved to https://phabricator.wikimedia.org/P56597 and previous config saved to /var/cache/conftool/dbconfig/20240209-184910-ladsgroup.json
  • 18:49 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host etherpad1004.eqiad.wmnet with OS bookworm
  • 18:39 cdanis@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 18:38 cdanis@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 18:37 cdanis@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 18:37 cdanis@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 18:36 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1008.eqiad.wmnet with OS bullseye
  • 18:36 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - bking@cumin2002"
  • 18:36 cdanis@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 18:36 cdanis@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 18:35 cdanis@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 18:35 cdanis@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 18:35 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on etherpad1004.eqiad.wmnet with reason: host reimage
  • 18:32 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - bking@cumin2002"
  • 18:32 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on etherpad1004.eqiad.wmnet with reason: host reimage
  • 18:19 dzahn@cumin1002: START - Cookbook sre.hosts.reimage for host etherpad1004.eqiad.wmnet with OS bookworm
  • 18:14 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1008.eqiad.wmnet with reason: host reimage
  • 18:11 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1008.eqiad.wmnet with reason: host reimage
  • 17:54 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1008.eqiad.wmnet with OS bullseye
  • 17:43 dzahn@cumin1002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host etherpad1004.eqiad.wmnet
  • 17:43 dzahn@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host etherpad1004.eqiad.wmnet with OS bookworm
  • 17:43 dzahn@cumin1002: START - Cookbook sre.hosts.reimage for host etherpad1004.eqiad.wmnet with OS bookworm
  • 17:41 dzahn@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM etherpad1004.eqiad.wmnet - dzahn@cumin1002"
  • 17:41 dzahn@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM etherpad1004.eqiad.wmnet - dzahn@cumin1002"
  • 17:40 dzahn@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) etherpad1004.eqiad.wmnet on all recursors
  • 17:40 dzahn@cumin1002: START - Cookbook sre.dns.wipe-cache etherpad1004.eqiad.wmnet on all recursors
  • 17:40 dzahn@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:40 dzahn@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM etherpad1004.eqiad.wmnet - dzahn@cumin1002"
  • 17:39 mutante: merging netbox/hiera data changes that add restbase hosts and show up when I run an unrelated cookbook creating a new VM - T354893
  • 17:35 dzahn@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM etherpad1004.eqiad.wmnet - dzahn@cumin1002"
  • 17:30 dzahn@cumin1002: START - Cookbook sre.dns.netbox
  • 17:30 dzahn@cumin1002: START - Cookbook sre.ganeti.makevm for new host etherpad1004.eqiad.wmnet
  • 17:18 cdanis: rolling restart of pods on k8s aux eqiad T356661
  • 17:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1213:3315 (T352010)', diff saved to https://phabricator.wikimedia.org/P56594 and previous config saved to /var/cache/conftool/dbconfig/20240209-171225-ladsgroup.json
  • 17:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1213.eqiad.wmnet with reason: Maintenance
  • 17:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1213.eqiad.wmnet with reason: Maintenance
  • 17:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1210 (T352010)', diff saved to https://phabricator.wikimedia.org/P56593 and previous config saved to /var/cache/conftool/dbconfig/20240209-171203-ladsgroup.json
  • 17:11 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for color-link-visited was not defined (T356928) (duration: 13m 13s)
  • 17:04 logmsgbot: lucaswerkmeister-wmde@deploy2002 jdlrobson and lucaswerkmeister-wmde: Continuing with sync
  • 16:59 logmsgbot: lucaswerkmeister-wmde@deploy2002 jdlrobson and lucaswerkmeister-wmde: Backport for color-link-visited was not defined (T356928) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 16:57 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for color-link-visited was not defined (T356928)
  • 16:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P56592 and previous config saved to /var/cache/conftool/dbconfig/20240209-165657-ladsgroup.json
  • 16:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P56591 and previous config saved to /var/cache/conftool/dbconfig/20240209-164150-ladsgroup.json
  • 16:39 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 16:34 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 16:29 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 16:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1210 (T352010)', diff saved to https://phabricator.wikimedia.org/P56590 and previous config saved to /var/cache/conftool/dbconfig/20240209-162643-ladsgroup.json
  • 16:23 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 16:18 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 16:13 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 15:59 jclark@cumin1002: START - Cookbook sre.hosts.provision for host restbase1035.mgmt.eqiad.wmnet with reboot policy FORCED
  • 15:59 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host restbase1035.eqiad.wmnet with OS bullseye
  • 15:58 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host restbase1035.eqiad.wmnet with OS bullseye
  • 15:51 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 15:06 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Banning hosts: cloudelastic1005*,cloudelastic1006*,cloudelastic1007*,cloudelastic1008* for IP migration - bking@cumin2002 - T355617
  • 15:05 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Banning hosts: cloudelastic1005*,cloudelastic1006*,cloudelastic1007*,cloudelastic1008* for IP migration - bking@cumin2002 - T355617
  • 14:34 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host restbase1035.eqiad.wmnet with OS bullseye
  • 14:00 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset: apply
  • 14:00 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset: apply
  • 13:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1210 (T352010)', diff saved to https://phabricator.wikimedia.org/P56588 and previous config saved to /var/cache/conftool/dbconfig/20240209-135337-ladsgroup.json
  • 13:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1210.eqiad.wmnet with reason: Maintenance
  • 13:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1210.eqiad.wmnet with reason: Maintenance
  • 13:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T352010)', diff saved to https://phabricator.wikimedia.org/P56587 and previous config saved to /var/cache/conftool/dbconfig/20240209-135315-ladsgroup.json
  • 13:48 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset: apply
  • 13:47 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset-next: apply
  • 13:47 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on restbase1040.eqiad.wmnet with reason: host reimage
  • 13:47 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 13:46 isaranto@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
  • 13:44 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on restbase1041.eqiad.wmnet with reason: host reimage
  • 13:42 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on restbase1037.eqiad.wmnet with reason: host reimage
  • 13:39 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on restbase1042.eqiad.wmnet with reason: host reimage
  • 13:38 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host puppetserver2003.mgmt.codfw.wmnet with reboot policy FORCED
  • 13:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P56586 and previous config saved to /var/cache/conftool/dbconfig/20240209-133809-ladsgroup.json
  • 13:36 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on restbase1038.eqiad.wmnet with reason: host reimage
  • 13:34 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on restbase1039.eqiad.wmnet with reason: host reimage
  • 13:32 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on restbase1040.eqiad.wmnet with reason: host reimage
  • 13:32 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on restbase1041.eqiad.wmnet with reason: host reimage
  • 13:32 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on restbase1042.eqiad.wmnet with reason: host reimage
  • 13:31 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on restbase1034.eqiad.wmnet with reason: host reimage
  • 13:31 topranks: enabling BGP peering to NL-IX (new IXP connection) route servers from cr2-esams T322630
  • 13:30 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on restbase1039.eqiad.wmnet with reason: host reimage
  • 13:29 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on restbase1037.eqiad.wmnet with reason: host reimage
  • 13:29 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on restbase1038.eqiad.wmnet with reason: host reimage
  • 13:26 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on restbase1034.eqiad.wmnet with reason: host reimage
  • 13:25 jmm@cumin2002: START - Cookbook sre.hosts.provision for host puppetserver2003.mgmt.codfw.wmnet with reboot policy FORCED
  • 13:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P56585 and previous config saved to /var/cache/conftool/dbconfig/20240209-132302-ladsgroup.json
  • 13:17 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host restbase1042.eqiad.wmnet with OS bullseye
  • 13:16 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host restbase1041.eqiad.wmnet with OS bullseye
  • 13:16 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host restbase1040.eqiad.wmnet with OS bullseye
  • 13:15 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host restbase1039.eqiad.wmnet with OS bullseye
  • 13:14 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host restbase1038.eqiad.wmnet with OS bullseye
  • 13:14 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host restbase1037.eqiad.wmnet with OS bullseye
  • 13:14 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host restbase1035.eqiad.wmnet with OS bullseye
  • 13:11 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host restbase1034.eqiad.wmnet with OS bullseye
  • 13:08 jclark@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['restbase1039']
  • 13:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T352010)', diff saved to https://phabricator.wikimedia.org/P56584 and previous config saved to /var/cache/conftool/dbconfig/20240209-130755-ladsgroup.json
  • 13:07 jclark@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['restbase1039']
  • 13:07 jclark@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['restbase1034']
  • 13:07 jclark@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['restbase1034']
  • 13:06 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:06 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add new entries for puppetserver2003 - cmooney@cumin1002"
  • 13:05 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add new entries for puppetserver2003 - cmooney@cumin1002"
  • 13:03 jclark@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['restbase1034']
  • 13:03 jclark@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['restbase1034']
  • 13:02 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 13:00 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host restbase1035.mgmt.eqiad.wmnet with reboot policy FORCED
  • 13:00 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host restbase1040.mgmt.eqiad.wmnet with reboot policy FORCED
  • 13:00 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host restbase1041.mgmt.eqiad.wmnet with reboot policy FORCED
  • 13:00 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host restbase1034.mgmt.eqiad.wmnet with reboot policy FORCED
  • 12:59 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host restbase1038.mgmt.eqiad.wmnet with reboot policy FORCED
  • 12:59 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host restbase1037.mgmt.eqiad.wmnet with reboot policy FORCED
  • 12:58 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host restbase1039.mgmt.eqiad.wmnet with reboot policy FORCED
  • 12:57 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host restbase1042.mgmt.eqiad.wmnet with reboot policy FORCED
  • 12:50 stran@deploy2002: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
  • 12:49 stran@deploy2002: helmfile [codfw] START helmfile.d/services/ipoid: apply
  • 12:49 stran@deploy2002: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
  • 12:48 stran@deploy2002: helmfile [eqiad] START helmfile.d/services/ipoid: apply
  • 12:47 stran@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
  • 12:47 stran@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
  • 12:47 stran@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
  • 12:45 stran@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
  • 12:45 stran@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
  • 12:44 stran@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
  • 12:43 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be1044
  • 12:36 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:36 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: puppetmaster2003 rename - jmm@cumin2002"
  • 12:35 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: puppetmaster2003 rename - jmm@cumin2002"
  • 12:35 jclark@cumin1002: START - Cookbook sre.hosts.provision for host restbase1034.mgmt.eqiad.wmnet with reboot policy FORCED
  • 12:35 jclark@cumin1002: START - Cookbook sre.hosts.provision for host restbase1035.mgmt.eqiad.wmnet with reboot policy FORCED
  • 12:33 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 12:32 jclark@cumin1002: START - Cookbook sre.hosts.provision for host restbase1037.mgmt.eqiad.wmnet with reboot policy FORCED
  • 12:31 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host restbase1037.mgmt.eqiad.wmnet with reboot policy FORCED
  • 12:31 jclark@cumin1002: START - Cookbook sre.hosts.provision for host restbase1037.mgmt.eqiad.wmnet with reboot policy FORCED
  • 12:30 jclark@cumin1002: START - Cookbook sre.hosts.provision for host restbase1039.mgmt.eqiad.wmnet with reboot policy FORCED
  • 12:30 jclark@cumin1002: START - Cookbook sre.hosts.provision for host restbase1040.mgmt.eqiad.wmnet with reboot policy FORCED
  • 12:28 jclark@cumin1002: START - Cookbook sre.hosts.provision for host restbase1041.mgmt.eqiad.wmnet with reboot policy FORCED
  • 12:27 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host restbase1041.mgmt.eqiad.wmnet with reboot policy FORCED
  • 12:27 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host restbase1040.mgmt.eqiad.wmnet with reboot policy FORCED
  • 12:27 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host restbase1037.mgmt.eqiad.wmnet with reboot policy FORCED
  • 12:26 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host restbase1039.mgmt.eqiad.wmnet with reboot policy FORCED
  • 12:26 jclark@cumin1002: START - Cookbook sre.hosts.provision for host restbase1037.mgmt.eqiad.wmnet with reboot policy FORCED
  • 12:26 jclark@cumin1002: START - Cookbook sre.hosts.provision for host restbase1038.mgmt.eqiad.wmnet with reboot policy FORCED
  • 12:26 jclark@cumin1002: START - Cookbook sre.hosts.provision for host restbase1039.mgmt.eqiad.wmnet with reboot policy FORCED
  • 12:26 jclark@cumin1002: START - Cookbook sre.hosts.provision for host restbase1040.mgmt.eqiad.wmnet with reboot policy FORCED
  • 12:26 jclark@cumin1002: START - Cookbook sre.hosts.provision for host restbase1041.mgmt.eqiad.wmnet with reboot policy FORCED
  • 12:26 jclark@cumin1002: START - Cookbook sre.hosts.provision for host restbase1042.mgmt.eqiad.wmnet with reboot policy FORCED
  • 12:24 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be1044
  • 12:14 mvernon@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ms-be1044.eqiad.wmnet
  • 12:04 mvernon@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ms-be1044.eqiad.wmnet
  • 11:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1166 (T352010)', diff saved to https://phabricator.wikimedia.org/P56583 and previous config saved to /var/cache/conftool/dbconfig/20240209-114208-ladsgroup.json
  • 11:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 11:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 11:41 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
  • 11:40 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/eventstreams: apply
  • 11:39 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
  • 11:39 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/eventstreams: apply
  • 11:32 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
  • 11:31 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
  • 11:30 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
  • 11:30 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
  • 11:26 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
  • 11:25 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/eventstreams: apply
  • 10:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1200 (T352010)', diff saved to https://phabricator.wikimedia.org/P56582 and previous config saved to /var/cache/conftool/dbconfig/20240209-102336-ladsgroup.json
  • 10:23 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1200.eqiad.wmnet with reason: Maintenance
  • 10:23 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1200.eqiad.wmnet with reason: Maintenance
  • 10:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T352010)', diff saved to https://phabricator.wikimedia.org/P56581 and previous config saved to /var/cache/conftool/dbconfig/20240209-102314-ladsgroup.json
  • 10:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P56580 and previous config saved to /var/cache/conftool/dbconfig/20240209-100808-ladsgroup.json
  • 09:54 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2194.codfw.wmnet with OS bookworm
  • 09:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P56579 and previous config saved to /var/cache/conftool/dbconfig/20240209-095301-ladsgroup.json
  • 09:46 moritzm: uploaded openjdk-8 8u402-ga-2~deb10u1 for buster-wikimedia (backport of latest Java 8 security updates)
  • 09:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T352010)', diff saved to https://phabricator.wikimedia.org/P56578 and previous config saved to /var/cache/conftool/dbconfig/20240209-093754-ladsgroup.json
  • 09:32 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2194.codfw.wmnet with reason: host reimage
  • 09:28 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2194.codfw.wmnet with reason: host reimage
  • 09:08 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db2194.codfw.wmnet with OS bookworm
  • 08:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts puppetmaster2003.codfw.wmnet
  • 08:39 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:39 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: puppetmaster2003.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 08:37 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: puppetmaster2003.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 08:35 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 08:29 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts puppetmaster2003.codfw.wmnet
  • 06:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1185 (T352010)', diff saved to https://phabricator.wikimedia.org/P56577 and previous config saved to /var/cache/conftool/dbconfig/20240209-065147-ladsgroup.json
  • 06:51 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1185.eqiad.wmnet with reason: Maintenance
  • 06:51 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1185.eqiad.wmnet with reason: Maintenance
  • 06:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T352010)', diff saved to https://phabricator.wikimedia.org/P56576 and previous config saved to /var/cache/conftool/dbconfig/20240209-065125-ladsgroup.json
  • 06:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db1124.eqiad.wmnet
  • 06:38 marostegui@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 06:38 marostegui@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1124.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1002"
  • 06:36 marostegui@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1124.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1002"
  • 06:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P56575 and previous config saved to /var/cache/conftool/dbconfig/20240209-063618-ladsgroup.json
  • 06:34 marostegui@cumin1002: START - Cookbook sre.dns.netbox
  • 06:29 marostegui@cumin1002: START - Cookbook sre.hosts.decommission for hosts db1124.eqiad.wmnet
  • 06:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P56574 and previous config saved to /var/cache/conftool/dbconfig/20240209-062111-ladsgroup.json
  • 06:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T352010)', diff saved to https://phabricator.wikimedia.org/P56573 and previous config saved to /var/cache/conftool/dbconfig/20240209-060605-ladsgroup.json
  • 05:48 marostegui: dbmaint Schema change on s7@codfw T357067
  • 04:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 04:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 03:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1161 (T352010)', diff saved to https://phabricator.wikimedia.org/P56572 and previous config saved to /var/cache/conftool/dbconfig/20240209-030028-ladsgroup.json
  • 03:00 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 02:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 02:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1161.eqiad.wmnet with reason: Maintenance
  • 02:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1161.eqiad.wmnet with reason: Maintenance
  • 00:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 00:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1145.eqiad.wmnet with reason: Maintenance

2024-02-08

  • 23:57 volans@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest1001.mgmt.eqiad.wmnet with reboot policy GRACEFUL
  • 23:56 volans@cumin1002: START - Cookbook sre.hosts.provision for host sretest1001.mgmt.eqiad.wmnet with reboot policy GRACEFUL
  • 23:50 foks: removing 14 files for legal compliance
  • 23:28 foks: removing one file for legal compliance
  • 23:17 foks: removing two files for legal compliance
  • 22:58 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Unbanning all hosts in cloudelastic
  • 22:57 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Unbanning all hosts in cloudelastic
  • 22:51 jhathaway: made a stupid mistake and accidentally installed knot & unbound on dns1004; based on the logs, I don't think any harm was caused, and they have since been removed
  • 22:44 jclark@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 22:44 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: racked and provision network restbase servers - jclark@cumin1002"
  • 22:43 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: racked and provision network restbase servers - jclark@cumin1002"
  • 22:41 jclark@cumin1002: START - Cookbook sre.dns.netbox
  • 22:38 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Banning hosts: cloudelastic1005*,cloudelastic1006*,cloudelastic1007*,cloudelastic1008* for IP migration - bking@cumin2002 - T355617
  • 22:38 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Banning hosts: cloudelastic1005*,cloudelastic1006*,cloudelastic1007*,cloudelastic1008* for IP migration - bking@cumin2002 - T355617
  • 22:26 vriley@cumin1001: START - Cookbook sre.hosts.provision for host restbase1035.mgmt.eqiad.wmnet with reboot policy FORCED
  • 22:24 vriley@cumin1001: START - Cookbook sre.hosts.provision for host restbase1034.mgmt.eqiad.wmnet with reboot policy FORCED
  • 22:21 jclark@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 22:21 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: racked and provision network restbase servers - jclark@cumin1002"
  • 22:20 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: racked and provision network restbase servers - jclark@cumin1002"
  • 22:18 jclark@cumin1002: START - Cookbook sre.dns.netbox
  • 22:11 topranks: adding missing external-links group to AMS-IX peering port ae1.380 cr1-esams
  • 22:06 urbanecm@deploy2002: Finished scap: Backport for Echo: Conditional defaults: Fix start timestamp (T353225) (duration: 09m 29s)
  • 22:00 urbanecm@deploy2002: urbanecm: Continuing with sync
  • 21:59 Daimona: T357007 Running mwscript /home/daimona/GenerateInvitationList.php --wiki=metawiki --listfile=/home/daimona/list2.txt (same as current master)
  • 21:58 urbanecm@deploy2002: urbanecm: Backport for Echo: Conditional defaults: Fix start timestamp (T353225) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:57 urbanecm@deploy2002: Started scap: Backport for Echo: Conditional defaults: Fix start timestamp (T353225)
  • 21:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1160 (re)pooling @ 100%: Maint over', diff saved to https://phabricator.wikimedia.org/P56571 and previous config saved to /var/cache/conftool/dbconfig/20240208-214640-root.json
  • 21:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'db2140 (re)pooling @ 100%: Maint over', diff saved to https://phabricator.wikimedia.org/P56570 and previous config saved to /var/cache/conftool/dbconfig/20240208-214625-root.json
  • 21:46 urbanecm@deploy2002: Finished scap: Backport for Echo: Use conditional defaults for 4 user properties (T353225) (duration: 09m 07s)
  • 21:40 urbanecm@deploy2002: urbanecm: Continuing with sync
  • 21:38 urbanecm@deploy2002: urbanecm: Backport for Echo: Use conditional defaults for 4 user properties (T353225) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:37 urbanecm@deploy2002: Started scap: Backport for Echo: Use conditional defaults for 4 user properties (T353225)
  • 21:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1160 (re)pooling @ 75%: Maint over', diff saved to https://phabricator.wikimedia.org/P56569 and previous config saved to /var/cache/conftool/dbconfig/20240208-213135-root.json
  • 21:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'db2140 (re)pooling @ 75%: Maint over', diff saved to https://phabricator.wikimedia.org/P56568 and previous config saved to /var/cache/conftool/dbconfig/20240208-213120-root.json
  • 21:25 brennen@deploy2002: rebuilt and synchronized wikiversions files: group2 wikis to 1.42.0-wmf.17 refs T354435
  • 21:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1160 (re)pooling @ 25%: Maint over', diff saved to https://phabricator.wikimedia.org/P56567 and previous config saved to /var/cache/conftool/dbconfig/20240208-211630-root.json
  • 21:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'db2140 (re)pooling @ 25%: Maint over', diff saved to https://phabricator.wikimedia.org/P56566 and previous config saved to /var/cache/conftool/dbconfig/20240208-211615-root.json
  • 21:13 brennen@deploy2002: Synchronized php: group1 wikis to 1.42.0-wmf.17 refs T354435 (duration: 06m 52s)
  • 21:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1140.eqiad.wmnet with reason: Maintenance
  • 21:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1140.eqiad.wmnet with reason: Maintenance
  • 21:06 brennen@deploy2002: rebuilt and synchronized wikiversions files: group1 wikis to 1.42.0-wmf.17 refs T354435
  • 21:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1144.eqiad.wmnet with reason: Maintenance
  • 21:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1144.eqiad.wmnet with reason: Maintenance
  • 21:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1160 (re)pooling @ 10%: Maint over', diff saved to https://phabricator.wikimedia.org/P56565 and previous config saved to /var/cache/conftool/dbconfig/20240208-210125-root.json
  • 21:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'db2140 (re)pooling @ 10%: Maint over', diff saved to https://phabricator.wikimedia.org/P56564 and previous config saved to /var/cache/conftool/dbconfig/20240208-210110-root.json
  • 20:55 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 20:55 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns entries for sretest2003 - cmooney@cumin1002"
  • 20:55 brennen@deploy2002: Finished scap: Backport for Revert "Migrate `editResponseTime` metric to Prometheus store" (T357050) (duration: 09m 17s)
  • 20:54 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns entries for sretest2003 - cmooney@cumin1002"
  • 20:48 brennen@deploy2002: brennen: Continuing with sync
  • 20:47 brennen@deploy2002: brennen: Backport for Revert "Migrate `editResponseTime` metric to Prometheus store" (T357050) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:45 brennen@deploy2002: Started scap: Backport for Revert "Migrate `editResponseTime` metric to Prometheus store" (T357050)
  • 20:39 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 20:24 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 20:24 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns entries for sretest2003 - cmooney@cumin1002"
  • 20:24 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns entries for sretest2003 - cmooney@cumin1002"
  • 20:18 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 20:17 cmooney@cumin1002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 20:16 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 20:16 cmooney@cumin1002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 20:07 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 19:01 brennen: train 1.42.0-wmf.17 (T354435): currently rolled back to group0; blocked pending a fix for edit metrics (further details to come)
  • 18:58 ejegg: re-enabled fundraising scheduled jobs
  • 18:49 ejegg: standalone SmashPig upgraded from 20d6434e to 669a9fe3
  • 18:48 brennen@deploy2002: rebuilt and synchronized wikiversions files: group0 wikis to 1.42.0-wmf.17 refs T354435
  • 18:41 ejegg: jobs disabled for option change
  • 18:03 bd808@deploy2002: helmfile [eqiad] DONE helmfile.d/services/toolhub: apply
  • 18:02 bd808@deploy2002: helmfile [eqiad] START helmfile.d/services/toolhub: apply
  • 18:02 bd808@deploy2002: helmfile [codfw] DONE helmfile.d/services/toolhub: apply
  • 18:01 bd808@deploy2002: helmfile [codfw] START helmfile.d/services/toolhub: apply
  • 18:01 bd808@deploy2002: helmfile [staging] DONE helmfile.d/services/toolhub: apply
  • 18:00 bd808@deploy2002: helmfile [staging] START helmfile.d/services/toolhub: apply
  • 17:52 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 100%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P56563 and previous config saved to /var/cache/conftool/dbconfig/20240208-175206-root.json
  • 17:51 marostegui@cumin1002: dbctl commit (dc=all): 'db2103 (re)pooling @ 100%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P56562 and previous config saved to /var/cache/conftool/dbconfig/20240208-175149-root.json
  • 17:45 mutante: deploy1002/deploy2002 - change in scap foreachwikiindblist deployed (gerrit:992263)
  • 17:37 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 75%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P56561 and previous config saved to /var/cache/conftool/dbconfig/20240208-173701-root.json
  • 17:36 marostegui@cumin1002: dbctl commit (dc=all): 'db2103 (re)pooling @ 75%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P56560 and previous config saved to /var/cache/conftool/dbconfig/20240208-173644-root.json
  • 17:29 marostegui@cumin1002: dbctl commit (dc=all): 'db2122 (re)pooling @ 100%: After schema change', diff saved to https://phabricator.wikimedia.org/P56559 and previous config saved to /var/cache/conftool/dbconfig/20240208-172902-root.json
  • 17:21 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 50%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P56558 and previous config saved to /var/cache/conftool/dbconfig/20240208-172156-root.json
  • 17:21 marostegui@cumin1002: dbctl commit (dc=all): 'db2103 (re)pooling @ 50%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P56557 and previous config saved to /var/cache/conftool/dbconfig/20240208-172139-root.json
  • 17:15 brennen@deploy2002: Synchronized php: group1 wikis to 1.42.0-wmf.17 refs T354435 (duration: 06m 52s)
  • 17:13 marostegui@cumin1002: dbctl commit (dc=all): 'db2122 (re)pooling @ 75%: After schema change', diff saved to https://phabricator.wikimedia.org/P56556 and previous config saved to /var/cache/conftool/dbconfig/20240208-171358-root.json
  • 17:09 brennen@deploy2002: rebuilt and synchronized wikiversions files: group1 wikis to 1.42.0-wmf.17 refs T354435
  • 17:06 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 25%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P56555 and previous config saved to /var/cache/conftool/dbconfig/20240208-170651-root.json
  • 17:06 marostegui@cumin1002: dbctl commit (dc=all): 'db2103 (re)pooling @ 25%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P56554 and previous config saved to /var/cache/conftool/dbconfig/20240208-170634-root.json
  • 17:01 brennen: train 1.42.0-wmf.17 (T354435): blockers resolved, rolling to group1
  • 16:58 marostegui@cumin1002: dbctl commit (dc=all): 'db2122 (re)pooling @ 50%: After schema change', diff saved to https://phabricator.wikimedia.org/P56553 and previous config saved to /var/cache/conftool/dbconfig/20240208-165853-root.json
  • 16:51 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 10%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P56552 and previous config saved to /var/cache/conftool/dbconfig/20240208-165147-root.json
  • 16:51 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
  • 16:51 marostegui@cumin1002: dbctl commit (dc=all): 'db2103 (re)pooling @ 10%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P56551 and previous config saved to /var/cache/conftool/dbconfig/20240208-165129-root.json
  • 16:48 cgoubert@cumin2002: conftool action : set/pooled=yes; selector: name=(mw2379|mw2380|mw2382|mw2383|mw2384|mw2385|mw2386|mw2387|mw2388|mw2389|mw2390|mw2391|mw2392|mw2393|mw2394|mw2396|mw2397|mw2398|mw2399|mw2400|mw2298|mw2299|mw2300).*
  • 16:48 claime: Repooling mw2379|mw2380|mw2382|mw2383|mw2384|mw2385|mw2386|mw2387|mw2388|mw2389|mw2390|mw2391|mw2392|mw2393|mw2394|mw2396|mw2397|mw2398|mw2399|mw2400|mw2298|mw2299|mw2300 - T355862
  • 16:43 marostegui@cumin1002: dbctl commit (dc=all): 'db2122 (re)pooling @ 25%: After schema change', diff saved to https://phabricator.wikimedia.org/P56550 and previous config saved to /var/cache/conftool/dbconfig/20240208-164348-root.json
  • 16:40 claime: Uncordoning mw2377.codfw.wmnet mw2378.codfw.wmnet mw2381.codfw.wmnet mw2395.codfw.wmnet mw2291.codfw.wmnet mw2292.codfw.wmnet mw2293.codfw.wmnet mw2294.codfw.wmnet mw2295.codfw.wmnet mw2296.codfw.wmnet mw2297.codfw.wmnet - T355862
  • 16:37 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for asw-a-codfw,cr[1-2]-codfw,lsw1-a3-codfw.mgmt
  • 16:37 cmooney@cumin1002: START - Cookbook sre.hosts.remove-downtime for asw-a-codfw,cr[1-2]-codfw,lsw1-a3-codfw.mgmt
  • 16:36 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 5%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P56549 and previous config saved to /var/cache/conftool/dbconfig/20240208-163642-root.json
  • 16:36 marostegui@cumin1002: dbctl commit (dc=all): 'db2103 (re)pooling @ 5%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P56548 and previous config saved to /var/cache/conftool/dbconfig/20240208-163624-root.json
  • 16:31 hnowlan@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host mw2282.codfw.wmnet with OS bullseye
  • 16:28 marostegui@cumin1002: dbctl commit (dc=all): 'db2122 (re)pooling @ 10%: After schema change', diff saved to https://phabricator.wikimedia.org/P56547 and previous config saved to /var/cache/conftool/dbconfig/20240208-162843-root.json
  • 16:26 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2282.codfw.wmnet with OS bullseye
  • 16:23 topranks: Server move completed codfw rack A3 T355862
  • 16:15 Dreamy_Jazz: Running `mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --use-jobqueue --sleep 30 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-30-no-render-now.txt` on a tmux session - See https://wikitech.wikimedia.org/wiki/MediaModeration
  • 16:13 marostegui@cumin1002: dbctl commit (dc=all): 'db2122 (re)pooling @ 5%: After schema change', diff saved to https://phabricator.wikimedia.org/P56546 and previous config saved to /var/cache/conftool/dbconfig/20240208-161338-root.json
  • 16:10 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 39 hosts with reason: Migrating servers in codfw rack A3 to lsw1-a3-codfw
  • 16:10 hnowlan@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host mw2282.codfw.wmnet with OS bullseye
  • 16:09 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on 39 hosts with reason: Migrating servers in codfw rack A3 to lsw1-a3-codfw
  • 16:09 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on asw-a-codfw,cr[1-2]-codfw,lsw1-a3-codfw.mgmt with reason: server uplink migration codfw rack a3
  • 16:09 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on asw-a-codfw,cr[1-2]-codfw,lsw1-a3-codfw.mgmt with reason: server uplink migration codfw rack a3
  • 16:07 topranks: Commencing server uplink moves from old switch to new in codfw rack A3 T355862
  • 16:05 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
  • 16:04 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2282.codfw.wmnet with OS bullseye
  • 16:04 hnowlan@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host mw2282.codfw.wmnet with OS bullseye
  • 16:03 moritzm: installing pillow security updates
  • 16:03 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2282.codfw.wmnet with OS bullseye
  • 16:03 hnowlan@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host mw2282.codfw.wmnet with OS bullseye
  • 15:58 marostegui@cumin1002: dbctl commit (dc=all): 'db2122 (re)pooling @ 1%: After schema change', diff saved to https://phabricator.wikimedia.org/P56545 and previous config saved to /var/cache/conftool/dbconfig/20240208-155833-root.json
  • 15:57 topranks: moving Netbox server uplinks from asw-a3-codfw to lsw1-a3-codfw to prep config for server moves T355862
  • 15:57 Dreamy_Jazz: Running `foreachwikiindblist group0.dblist extensions/MediaModeration/maintenance/scanFilesInScanTable.php --use-jobqueue --sleep 30 --verbose 2>&1 | tee ~/scan-files-in-scan-table-group0-sleep-30-thumbor.txt`
  • 15:57 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: wmcs::openstack::codfw1dev::cloudweb
  • 15:56 claime: Depooled mw2379|mw2380|mw2382|mw2383|mw2384|mw2385|mw2386|mw2387|mw2388|mw2389|mw2390|mw2391|mw2392|mw2393|mw2394|mw2396|mw2397|mw2398|mw2399|mw2400|mw2298|mw2299|mw2300 - T355862
  • 15:55 Dreamy_Jazz: Running `mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=testwiki --use-jobqueue --sleep 30 --verbose 2>&1 | tee ~/scan-files-in-scan-table-testwiki-sleep-30-no-render-now.txt`
  • 15:55 cgoubert@cumin2002: conftool action : set/pooled=inactive; selector: name=(mw2379|mw2380|mw2382|mw2383|mw2384|mw2385|mw2386|mw2387|mw2388|mw2389|mw2390|mw2391|mw2392|mw2393|mw2394|mw2396|mw2397|mw2398|mw2399|mw2400|mw2298|mw2299|mw2300).*
  • 15:54 dreamyjazz@deploy2002: Finished scap: Backport for Follow-up: MediaModerationImageContentsLookup: use proxied HTTP request to generate file (T356047), Follow-up: MediaModerationImageContentsLookup: use proxied HTTP request to generate file (T356047) (duration: 08m 03s)
  • 15:50 taavi@cumin1002: START - Cookbook sre.puppet.migrate-role for role: wmcs::openstack::codfw1dev::cloudweb
  • 15:49 claime: Draining mw2377.codfw.wmnet mw2378.codfw.wmnet mw2381.codfw.wmnet mw2395.codfw.wmnet mw2291.codfw.wmnet mw2292.codfw.wmnet mw2293.codfw.wmnet mw2294.codfw.wmnet mw2295.codfw.wmnet mw2296.codfw.wmnet mw2297.codfw.wmnet - T355862
  • 15:48 claime: Draining mw2377.codfw.wmnet mw2378.codfw.wmnet mw2381.codfw.wmnet mw2395.codfw.wmnet mw2291.codfw.wmnet mw2292.codfw.wmnet mw2293.codfw.wmnet mw2294.codfw.wmnet mw2295.codfw.wmnet mw2296.codfw.wmnet mw2297.codfw.wmnet - T355870
  • 15:47 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
  • 15:47 dreamyjazz@deploy2002: dreamyjazz: Backport for Follow-up: MediaModerationImageContentsLookup: use proxied HTTP request to generate file (T356047), Follow-up: MediaModerationImageContentsLookup: use proxied HTTP request to generate file (T356047) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 15:47 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2282.codfw.wmnet with OS bullseye
  • 15:47 hnowlan@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host mw2282.codfw.wmnet with OS bullseye
  • 15:47 Dreamy_Jazz: Stopped mediamoderation scanning script
  • 15:46 dreamyjazz@deploy2002: Started scap: Backport for Follow-up: MediaModerationImageContentsLookup: use proxied HTTP request to generate file (T356047), Follow-up: MediaModerationImageContentsLookup: use proxied HTTP request to generate file (T356047)
  • 15:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2122 (T355609)', diff saved to https://phabricator.wikimedia.org/P56544 and previous config saved to /var/cache/conftool/dbconfig/20240208-154452-marostegui.json
  • 15:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2122.codfw.wmnet with reason: Maintenance
  • 15:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2122.codfw.wmnet with reason: Maintenance
  • 15:41 marostegui: dbmaint Schema change on s3@codfw T356988
  • 15:39 marostegui: dbmaint Schema change on s4@codfw T356988
  • 15:38 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
  • 15:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T355609)', diff saved to https://phabricator.wikimedia.org/P56543 and previous config saved to /var/cache/conftool/dbconfig/20240208-152511-marostegui.json
  • 15:20 Dreamy_Jazz: Afternoon backport window done
  • 15:17 Dreamy_Jazz: Running `mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --use-jobqueue --sleep 30 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-30-no-render-now.txt` on a tmux session - See https://wikitech.wikimedia.org/wiki/MediaModeration
  • 15:17 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2282.codfw.wmnet with OS bullseye
  • 15:17 dreamyjazz@deploy2002: Finished scap: Backport for MediaModerationImageContentsLookup: use proxied HTTP request to generate file (T356047) (duration: 08m 42s)
  • 15:13 marostegui: dbmaint Schema change on s5@codfw T356988
  • 15:10 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
  • 15:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P56542 and previous config saved to /var/cache/conftool/dbconfig/20240208-151005-marostegui.json
  • 15:09 dreamyjazz@deploy2002: dreamyjazz: Backport for MediaModerationImageContentsLookup: use proxied HTTP request to generate file (T356047) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 15:08 marostegui: dbmaint Schema change on s7@codfw T356988
  • 15:08 dreamyjazz@deploy2002: Started scap: Backport for MediaModerationImageContentsLookup: use proxied HTTP request to generate file (T356047)
  • 15:05 marostegui: dbmaint (retroactive logging) Schema change on s7@codfw T356987
  • 15:05 Dreamy_Jazz: Stopped mediamoderation scanning script for commonswiki
  • 15:04 Dreamy_Jazz: testwiki scan finished
  • 15:03 marostegui: dbmaint Schema change on s8@codfw T356988
  • 15:03 marostegui: dbmaint Schema change on s6@codfw T356988
  • 15:03 marostegui: dbmaint Schema change on s2@codfw T356988
  • 15:03 marostegui: dbmaint Schema change on s1@codfw T356988
  • 14:55 Dreamy_Jazz: Running `mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=testwiki --use-jobqueue --sleep 30 --verbose 2>&1 | tee ~/scan-files-in-scan-table-testwiki-sleep-30-no-render-now.txt`
  • 14:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P56541 and previous config saved to /var/cache/conftool/dbconfig/20240208-145457-marostegui.json
  • 14:54 dreamyjazz@deploy2002: Finished scap: Backport for MediaModerationImageContentsLookup: use proxied HTTP request to generate file (T356047) (duration: 07m 49s)
  • 14:48 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
  • 14:48 dreamyjazz@deploy2002: dreamyjazz: Backport for MediaModerationImageContentsLookup: use proxied HTTP request to generate file (T356047) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:46 dreamyjazz@deploy2002: Started scap: Backport for MediaModerationImageContentsLookup: use proxied HTTP request to generate file (T356047)
  • 14:46 dreamyjazz@deploy2002: Finished scap: Backport for Add edit_interaction stream config for iOS (T355265) (duration: 10m 12s)
  • 14:40 dreamyjazz@deploy2002: tsev and dreamyjazz: Continuing with sync
  • 14:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T355609)', diff saved to https://phabricator.wikimedia.org/P56540 and previous config saved to /var/cache/conftool/dbconfig/20240208-143951-marostegui.json
  • 14:37 dreamyjazz@deploy2002: tsev and dreamyjazz: Backport for Add edit_interaction stream config for iOS (T355265) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:35 dreamyjazz@deploy2002: Started scap: Backport for Add edit_interaction stream config for iOS (T355265)
  • 14:35 dreamyjazz@deploy2002: Finished scap: Backport for Parser: Fix the main loop getting stuck on some signatures (T356884) (duration: 08m 29s)
  • 14:29 dreamyjazz@deploy2002: dreamyjazz and matmarex: Continuing with sync
  • 14:28 eoghan@cumin1002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM vrts1002.eqiad.wmnet
  • 14:28 dreamyjazz@deploy2002: dreamyjazz and matmarex: Backport for Parser: Fix the main loop getting stuck on some signatures (T356884) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:26 dreamyjazz@deploy2002: Started scap: Backport for Parser: Fix the main loop getting stuck on some signatures (T356884)
  • 14:26 dreamyjazz@deploy2002: Finished scap: Backport for Parser: Fix the main loop getting stuck on some signatures (T356884) (duration: 09m 36s)
  • 14:19 dreamyjazz@deploy2002: dreamyjazz and matmarex: Continuing with sync
  • 14:19 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudweb2002-dev.wikimedia.org with OS bullseye
  • 14:18 dreamyjazz@deploy2002: dreamyjazz and matmarex: Backport for Parser: Fix the main loop getting stuck on some signatures (T356884) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:16 dreamyjazz@deploy2002: Started scap: Backport for Parser: Fix the main loop getting stuck on some signatures (T356884)
  • 14:13 eoghan@cumin1002: START - Cookbook sre.ganeti.reboot-vm for VM vrts1002.eqiad.wmnet
  • 14:13 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
  • 14:07 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
  • 14:07 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
  • 13:58 Emperor: disable puppet and stop swift on ms-be10[44-50] T353149
  • 13:57 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8 days, 0:00:00 on 7 hosts with reason: due for decomm
  • 13:57 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 8 days, 0:00:00 on 7 hosts with reason: due for decomm
  • 13:51 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2182 (T355609)', diff saved to https://phabricator.wikimedia.org/P56539 and previous config saved to /var/cache/conftool/dbconfig/20240208-135142-marostegui.json
  • 13:51 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2182.codfw.wmnet with reason: Maintenance
  • 13:51 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2182.codfw.wmnet with reason: Maintenance
  • 13:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2140.codfw.wmnet with reason: Maintenance
  • 13:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2140.codfw.wmnet with reason: Maintenance
  • 13:42 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2169.codfw.wmnet with reason: Maintenance
  • 13:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2169.codfw.wmnet with reason: Maintenance
  • 13:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317 (T355609)', diff saved to https://phabricator.wikimedia.org/P56538 and previous config saved to /var/cache/conftool/dbconfig/20240208-134243-marostegui.json
  • 13:37 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudweb2002-dev.wikimedia.org with reason: host reimage
  • 13:35 taavi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudweb2002-dev.wikimedia.org with reason: host reimage
  • 13:31 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on A:swift-fe-codfw
  • 13:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317', diff saved to https://phabricator.wikimedia.org/P56537 and previous config saved to /var/cache/conftool/dbconfig/20240208-132736-marostegui.json
  • 13:24 mvernon@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on A:swift-fe-codfw
  • 13:21 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1160.eqiad.wmnet with reason: Maintenance
  • 13:21 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1160.eqiad.wmnet with reason: Maintenance
  • 13:13 taavi@cumin1002: START - Cookbook sre.hosts.reimage for host cloudweb2002-dev.wikimedia.org with OS bullseye
  • 13:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317', diff saved to https://phabricator.wikimedia.org/P56536 and previous config saved to /var/cache/conftool/dbconfig/20240208-131229-marostegui.json
  • 12:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317 (T355609)', diff saved to https://phabricator.wikimedia.org/P56535 and previous config saved to /var/cache/conftool/dbconfig/20240208-125723-marostegui.json
  • 12:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetserver1003.eqiad.wmnet
  • 12:35 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetserver1003.eqiad.wmnet
  • 12:33 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2168:3317 (T355609)', diff saved to https://phabricator.wikimedia.org/P56534 and previous config saved to /var/cache/conftool/dbconfig/20240208-123343-marostegui.json
  • 12:33 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2168.codfw.wmnet with reason: Maintenance
  • 12:33 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2168.codfw.wmnet with reason: Maintenance
  • 12:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T355609)', diff saved to https://phabricator.wikimedia.org/P56533 and previous config saved to /var/cache/conftool/dbconfig/20240208-123320-marostegui.json
  • 12:21 jgiannelos@deploy2002: Finished deploy [restbase/deploy@6a64b3d]: restbase: Disable parsoid storage for jawiki (duration: 15m 49s)
  • 12:18 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P56532 and previous config saved to /var/cache/conftool/dbconfig/20240208-121813-marostegui.json
  • 12:05 jgiannelos@deploy2002: Started deploy [restbase/deploy@6a64b3d]: restbase: Disable parsoid storage for jawiki
  • 12:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P56531 and previous config saved to /var/cache/conftool/dbconfig/20240208-120306-marostegui.json
  • 12:01 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
  • 12:01 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/thumbor: apply
  • 11:58 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
  • 11:58 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/thumbor: apply
  • 11:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T355609)', diff saved to https://phabricator.wikimedia.org/P56530 and previous config saved to /var/cache/conftool/dbconfig/20240208-114759-marostegui.json
  • 11:41 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
  • 11:41 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/thumbor: apply
  • 11:37 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2159 (T355609)', diff saved to https://phabricator.wikimedia.org/P56529 and previous config saved to /var/cache/conftool/dbconfig/20240208-113707-marostegui.json
  • 11:37 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 11:36 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 11:36 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2159.codfw.wmnet with reason: Maintenance
  • 11:36 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2159.codfw.wmnet with reason: Maintenance
  • 11:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T355609)', diff saved to https://phabricator.wikimedia.org/P56528 and previous config saved to /var/cache/conftool/dbconfig/20240208-113630-marostegui.json
  • 11:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P56527 and previous config saved to /var/cache/conftool/dbconfig/20240208-112123-marostegui.json
  • 11:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P56526 and previous config saved to /var/cache/conftool/dbconfig/20240208-110616-marostegui.json
  • 10:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T355609)', diff saved to https://phabricator.wikimedia.org/P56525 and previous config saved to /var/cache/conftool/dbconfig/20240208-105110-marostegui.json
  • 10:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1160.eqiad.wmnet with reason: Maintenance
  • 10:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1160.eqiad.wmnet with reason: Maintenance
  • 10:41 hnowlan@cumin2002: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on P{lvs1019*,lvs2013*} and A:lvs (T334488)
  • 10:40 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2150 (T355609)', diff saved to https://phabricator.wikimedia.org/P56524 and previous config saved to /var/cache/conftool/dbconfig/20240208-104011-marostegui.json
  • 10:40 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2150.codfw.wmnet with reason: Maintenance
  • 10:40 hnowlan@cumin2002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on P{lvs1019*,lvs2013*} and A:lvs (T334488)
  • 10:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2150.codfw.wmnet with reason: Maintenance
  • 10:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T355609)', diff saved to https://phabricator.wikimedia.org/P56523 and previous config saved to /var/cache/conftool/dbconfig/20240208-103949-marostegui.json
  • 10:39 hnowlan@cumin2002: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on P{lvs1020*,lvs2014*} and A:lvs (T334488)
  • 10:38 hnowlan@cumin2002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on P{lvs1020*,lvs2014*} and A:lvs (T334488)
  • 10:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P56522 and previous config saved to /var/cache/conftool/dbconfig/20240208-102442-marostegui.json
  • 10:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P56521 and previous config saved to /var/cache/conftool/dbconfig/20240208-100936-marostegui.json
  • 10:07 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
  • 10:06 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
  • 10:05 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
  • 10:05 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
  • 10:04 jelto@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab to new version
  • 10:03 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
  • 10:03 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
  • 10:02 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
  • 10:01 jiji@cumin1002: conftool action : set/pooled=inactive; selector: service=kubesvc,name=mw2282.codfw.wmnet
  • 10:01 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
  • 10:00 jiji@cumin1002: conftool action : set/pooled=no; selector: service=kubesvc,name=mw2282.codfw.wmnet
  • 09:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T355609)', diff saved to https://phabricator.wikimedia.org/P56520 and previous config saved to /var/cache/conftool/dbconfig/20240208-095429-marostegui.json
  • 09:36 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
  • 09:34 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
  • 09:21 vgutierrez@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir and not P{ncredir2.*} and A:ncredir
  • 09:08 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudweb1003.wikimedia.org
  • 09:01 taavi@cumin1002: START - Cookbook sre.hosts.reboot-single for host cloudweb1003.wikimedia.org
  • 08:55 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2122 (T355609)', diff saved to https://phabricator.wikimedia.org/P56518 and previous config saved to /var/cache/conftool/dbconfig/20240208-085357-marostegui.json
  • 08:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2122.codfw.wmnet with reason: Maintenance
  • 08:55 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2122.codfw.wmnet with reason: Maintenance
  • 08:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121 (T355609)', diff saved to https://phabricator.wikimedia.org/P56517 and previous config saved to /var/cache/conftool/dbconfig/20240208-085334-marostegui.json
  • 08:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P56516 and previous config saved to /var/cache/conftool/dbconfig/20240208-083827-marostegui.json
  • 08:37 urbanecm@deploy2002: Finished scap: Backport for Use real anonymous user in ComputedUserImpactLookup (T356895) (duration: 07m 49s)
  • 08:29 urbanecm@deploy2002: Started scap: Backport for Use real anonymous user in ComputedUserImpactLookup (T356895)
  • 08:25 marostegui@cumin1002: dbctl commit (dc=all): 'es2032 (re)pooling @ 100%: After reimage', diff saved to https://phabricator.wikimedia.org/P56515 and previous config saved to /var/cache/conftool/dbconfig/20240208-082544-root.json
  • 08:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P56514 and previous config saved to /var/cache/conftool/dbconfig/20240208-082320-marostegui.json
  • 08:19 vgutierrez@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir and not P{ncredir2.*} and A:ncredir
  • 08:17 jelto@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab to new version
  • 08:10 marostegui@cumin1002: dbctl commit (dc=all): 'es2032 (re)pooling @ 75%: After reimage', diff saved to https://phabricator.wikimedia.org/P56513 and previous config saved to /var/cache/conftool/dbconfig/20240208-081039-root.json
  • 08:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121 (T355609)', diff saved to https://phabricator.wikimedia.org/P56512 and previous config saved to /var/cache/conftool/dbconfig/20240208-080814-marostegui.json
  • 07:55 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2121 (T355609)', diff saved to https://phabricator.wikimedia.org/P56511 and previous config saved to /var/cache/conftool/dbconfig/20240208-075549-marostegui.json
  • 07:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 07:55 marostegui@cumin1002: dbctl commit (dc=all): 'es2032 (re)pooling @ 50%: After reimage', diff saved to https://phabricator.wikimedia.org/P56510 and previous config saved to /var/cache/conftool/dbconfig/20240208-075534-root.json
  • 07:55 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 07:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2120 (T355609)', diff saved to https://phabricator.wikimedia.org/P56509 and previous config saved to /var/cache/conftool/dbconfig/20240208-075526-marostegui.json
  • 07:51 jelto@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab Replica to new version
  • 07:49 vgutierrez: reboot ncredir2002 to validate https://gerrit.wikimedia.org/r/c/operations/puppet/+/998438
  • 07:45 vgutierrez: repool ncredir2001
  • 07:44 jelto@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab Replica to new version
  • 07:40 marostegui@cumin1002: dbctl commit (dc=all): 'es2032 (re)pooling @ 25%: After reimage', diff saved to https://phabricator.wikimedia.org/P56508 and previous config saved to /var/cache/conftool/dbconfig/20240208-074029-root.json
  • 07:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2120', diff saved to https://phabricator.wikimedia.org/P56507 and previous config saved to /var/cache/conftool/dbconfig/20240208-074019-marostegui.json
  • 07:39 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2140.codfw.wmnet with reason: Maintenance
  • 07:39 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2140.codfw.wmnet with reason: Maintenance
  • 07:28 arnaudb@cumin1002: dbctl commit (dc=all): 'Set db2140 as able to serve API', diff saved to https://phabricator.wikimedia.org/P56506 and previous config saved to /var/cache/conftool/dbconfig/20240208-072808-arnaudb.json
  • 07:25 marostegui@cumin1002: dbctl commit (dc=all): 'es2032 (re)pooling @ 10%: After reimage', diff saved to https://phabricator.wikimedia.org/P56505 and previous config saved to /var/cache/conftool/dbconfig/20240208-072523-root.json
  • 07:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2120', diff saved to https://phabricator.wikimedia.org/P56504 and previous config saved to /var/cache/conftool/dbconfig/20240208-072512-marostegui.json
  • 07:19 arnaudb@cumin1002: dbctl commit (dc=all): 'Depool db2140 T355658', diff saved to https://phabricator.wikimedia.org/P56503 and previous config saved to /var/cache/conftool/dbconfig/20240208-071916-arnaudb.json
  • 07:16 arnaudb@cumin1002: dbctl commit (dc=all): 'Promote db2179 to s4 primary and set section read-write T355658', diff saved to https://phabricator.wikimedia.org/P56502 and previous config saved to /var/cache/conftool/dbconfig/20240208-071559-arnaudb.json
  • 07:14 arnaudb@cumin1002: dbctl commit (dc=all): 'Set s4 codfw as read-only for maintenance - T355658', diff saved to https://phabricator.wikimedia.org/P56501 and previous config saved to /var/cache/conftool/dbconfig/20240208-071414-arnaudb.json
  • 07:12 arnaudb: Starting s4 codfw failover from db2140 to db2179 - T355658
  • 07:10 marostegui@cumin1002: dbctl commit (dc=all): 'es2032 (re)pooling @ 5%: After reimage', diff saved to https://phabricator.wikimedia.org/P56500 and previous config saved to /var/cache/conftool/dbconfig/20240208-071018-root.json
  • 07:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2120 (T355609)', diff saved to https://phabricator.wikimedia.org/P56499 and previous config saved to /var/cache/conftool/dbconfig/20240208-071006-marostegui.json
  • 06:57 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2120 (T355609)', diff saved to https://phabricator.wikimedia.org/P56498 and previous config saved to /var/cache/conftool/dbconfig/20240208-065742-marostegui.json
  • 06:57 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2120.codfw.wmnet with reason: Maintenance
  • 06:57 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2120.codfw.wmnet with reason: Maintenance
  • 06:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2108 (T355609)', diff saved to https://phabricator.wikimedia.org/P56497 and previous config saved to /var/cache/conftool/dbconfig/20240208-065720-marostegui.json
  • 06:56 marostegui@cumin1002: dbctl commit (dc=all): 'Promote es2032 back to es1 primary T351916', diff saved to https://phabricator.wikimedia.org/P56496 and previous config saved to /var/cache/conftool/dbconfig/20240208-065607-root.json
  • 06:55 marostegui@cumin1002: dbctl commit (dc=all): 'es2032 (re)pooling @ 1%: After reimage', diff saved to https://phabricator.wikimedia.org/P56495 and previous config saved to /var/cache/conftool/dbconfig/20240208-065513-root.json
  • 06:48 arnaudb@cumin1002: dbctl commit (dc=all): 'Set db2179 with weight 0 T355658', diff saved to https://phabricator.wikimedia.org/P56494 and previous config saved to /var/cache/conftool/dbconfig/20240208-064802-arnaudb.json
  • 06:47 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 38 hosts with reason: Primary switchover s4 T355658
  • 06:46 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 38 hosts with reason: Primary switchover s4 T355658
  • 06:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2108', diff saved to https://phabricator.wikimedia.org/P56493 and previous config saved to /var/cache/conftool/dbconfig/20240208-064213-marostegui.json
  • 06:41 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2032.codfw.wmnet with OS bookworm
  • 06:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2108', diff saved to https://phabricator.wikimedia.org/P56492 and previous config saved to /var/cache/conftool/dbconfig/20240208-062706-marostegui.json
  • 06:23 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2032.codfw.wmnet with reason: host reimage
  • 06:21 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es2032.codfw.wmnet with reason: host reimage
  • 06:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2108 (T355609)', diff saved to https://phabricator.wikimedia.org/P56491 and previous config saved to /var/cache/conftool/dbconfig/20240208-061200-marostegui.json
  • 06:03 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es2032.codfw.wmnet with OS bookworm
  • 06:02 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es2032 T351916', diff saved to https://phabricator.wikimedia.org/P56490 and previous config saved to /var/cache/conftool/dbconfig/20240208-060226-root.json
  • 06:02 marostegui@cumin1002: dbctl commit (dc=all): 'Promote es2020 to es1 primary T351916', diff saved to https://phabricator.wikimedia.org/P56489 and previous config saved to /var/cache/conftool/dbconfig/20240208-060204-root.json
  • 05:59 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2108 (T355609)', diff saved to https://phabricator.wikimedia.org/P56488 and previous config saved to /var/cache/conftool/dbconfig/20240208-055944-marostegui.json
  • 05:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2108.codfw.wmnet with reason: Maintenance
  • 05:59 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2108.codfw.wmnet with reason: Maintenance
  • 05:53 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2103 es2020 T355862', diff saved to https://phabricator.wikimedia.org/P56487 and previous config saved to /var/cache/conftool/dbconfig/20240208-055316-root.json
  • 05:49 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2100.codfw.wmnet with reason: Maintenance
  • 05:49 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2100.codfw.wmnet with reason: Maintenance
  • 05:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 05:37 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 02:41 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2198.mgmt.codfw.wmnet with reboot policy FORCED
  • 02:37 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2196.mgmt.codfw.wmnet with reboot policy FORCED
  • 02:35 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2197.mgmt.codfw.wmnet with reboot policy FORCED
  • 02:20 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2198.mgmt.codfw.wmnet with reboot policy FORCED
  • 02:18 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 02:18 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2198 to codfw - jhancock@cumin2002"
  • 02:17 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2198 to codfw - jhancock@cumin2002"
  • 02:15 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 02:13 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2197.mgmt.codfw.wmnet with reboot policy FORCED
  • 02:13 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 02:13 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2197 to codfw - jhancock@cumin2002"
  • 02:12 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2197 to codfw - jhancock@cumin2002"
  • 02:10 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 01:58 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2196.mgmt.codfw.wmnet with reboot policy FORCED
  • 01:57 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 01:57 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2196 to codfw - jhancock@cumin2002"
  • 01:57 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2196 to codfw - jhancock@cumin2002"
  • 01:54 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 00:43 ejegg: fundraising civicrm upgraded from 98d35c79 to c66b04bd

2024-02-07

  • 23:54 dzahn@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: security release
  • 22:49 brett: Uploaded ncmonitor 0.0.2 to bookworm-wikimedia archive
  • 22:46 ryankemper@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) (T347624, testing 961878 patch) xfer categories from wdqs2024.codfw.wmnet -> wdqs2025.codfw.wmnet w/ force delete existing files, repooling source-only afterwards
  • 22:46 ryankemper@cumin2002: START - Cookbook sre.wdqs.data-transfer (T347624, testing 961878 patch) xfer categories from wdqs2024.codfw.wmnet -> wdqs2025.codfw.wmnet w/ force delete existing files, repooling source-only afterwards
  • 22:16 ebernhardson@deploy2002: Finished scap: Backport for cirrus: Re-enable writes to wikidata on cloudelastic (T352335) (duration: 09m 10s)
  • 22:10 ebernhardson@deploy2002: ebernhardson: Continuing with sync
  • 22:09 ebernhardson@deploy2002: ebernhardson: Backport for cirrus: Re-enable writes to wikidata on cloudelastic (T352335) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 22:08 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 22:08 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 22:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1236 (T355609)', diff saved to https://phabricator.wikimedia.org/P56485 and previous config saved to /var/cache/conftool/dbconfig/20240207-220824-marostegui.json
  • 22:07 ebernhardson@deploy2002: Started scap: Backport for cirrus: Re-enable writes to wikidata on cloudelastic (T352335)
  • 22:07 dzahn@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: security release
  • 21:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1236', diff saved to https://phabricator.wikimedia.org/P56484 and previous config saved to /var/cache/conftool/dbconfig/20240207-215317-marostegui.json
  • 21:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1236', diff saved to https://phabricator.wikimedia.org/P56483 and previous config saved to /var/cache/conftool/dbconfig/20240207-213810-marostegui.json
  • 21:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1236 (T355609)', diff saved to https://phabricator.wikimedia.org/P56482 and previous config saved to /var/cache/conftool/dbconfig/20240207-212304-marostegui.json
  • 21:18 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1236 (T355609)', diff saved to https://phabricator.wikimedia.org/P56481 and previous config saved to /var/cache/conftool/dbconfig/20240207-211803-marostegui.json
  • 21:17 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1236.eqiad.wmnet with reason: Maintenance
  • 21:17 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1236.eqiad.wmnet with reason: Maintenance
  • 21:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T355609)', diff saved to https://phabricator.wikimedia.org/P56480 and previous config saved to /var/cache/conftool/dbconfig/20240207-211741-marostegui.json
  • 21:09 bking@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudelastic1008.eqiad.wmnet with OS bullseye
  • 21:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P56479 and previous config saved to /var/cache/conftool/dbconfig/20240207-210235-marostegui.json
  • 20:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P56478 and previous config saved to /var/cache/conftool/dbconfig/20240207-204728-marostegui.json
  • 20:43 brennen@deploy2002: Finished scap: Backport for Fix regression in HLS track content type (T356780) (duration: 10m 20s)
  • 20:37 brennen@deploy2002: brennen: Continuing with sync
  • 20:37 brennen@deploy2002: brennen: Backport for Fix regression in HLS track content type (T356780) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:33 brennen@deploy2002: Started scap: Backport for Fix regression in HLS track content type (T356780)
  • 20:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T355609)', diff saved to https://phabricator.wikimedia.org/P56477 and previous config saved to /var/cache/conftool/dbconfig/20240207-203222-marostegui.json
  • 20:21 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1227 (T355609)', diff saved to https://phabricator.wikimedia.org/P56475 and previous config saved to /var/cache/conftool/dbconfig/20240207-202123-marostegui.json
  • 20:21 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1227.eqiad.wmnet with reason: Maintenance
  • 20:21 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1227.eqiad.wmnet with reason: Maintenance
  • 20:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T355609)', diff saved to https://phabricator.wikimedia.org/P56474 and previous config saved to /var/cache/conftool/dbconfig/20240207-202101-marostegui.json
  • 20:18 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1008.eqiad.wmnet with OS bullseye
  • 20:09 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1008
  • 20:08 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1008
  • 20:07 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 20:07 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: migrate cloudelastic1008 to private IPs - bking@cumin2002"
  • 20:06 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: migrate cloudelastic1008 to private IPs - bking@cumin2002"
  • 20:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P56473 and previous config saved to /var/cache/conftool/dbconfig/20240207-200555-marostegui.json
  • 20:04 joal@deploy2002: Finished deploy [airflow-dags/analytics@ea0a3db]: Analytics Hotfix [airflow-dags/analytics@ea0a3db2] (duration: 00m 40s)
  • 20:03 joal@deploy2002: Started deploy [airflow-dags/analytics@ea0a3db]: Analytics Hotfix [airflow-dags/analytics@ea0a3db2]
  • 20:00 bking@cumin2002: START - Cookbook sre.dns.netbox
  • 19:56 bking@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudelastic1008.wikimedia.org
  • 19:56 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:56 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudelastic1008.wikimedia.org decommissioned, removing all IPs except the asset tag one - bking@cumin2002"
  • 19:55 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudelastic1008.wikimedia.org decommissioned, removing all IPs except the asset tag one - bking@cumin2002"
  • 19:52 bking@cumin2002: START - Cookbook sre.dns.netbox
  • 19:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P56472 and previous config saved to /var/cache/conftool/dbconfig/20240207-195047-marostegui.json
  • 19:47 joal@deploy2002: Finished deploy [analytics/refinery@80b329b] (hadoop-test): Analytics Hotfix - TEST [analytics/refinery@80b329b5] (duration: 03m 40s)
  • 19:45 bking@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudelastic1008.wikimedia.org
  • 19:43 joal@deploy2002: Started deploy [analytics/refinery@80b329b] (hadoop-test): Analytics Hotfix - TEST [analytics/refinery@80b329b5]
  • 19:42 joal@deploy2002: Finished deploy [analytics/refinery@80b329b] (thin): Analytics Hotfix -THIN [analytics/refinery@80b329b5] (duration: 00m 05s)
  • 19:42 joal@deploy2002: Started deploy [analytics/refinery@80b329b] (thin): Analytics Hotfix -THIN [analytics/refinery@80b329b5]
  • 19:42 joal@deploy2002: Finished deploy [analytics/refinery@80b329b]: Analytics Hotfix [analytics/refinery@80b329b5] (duration: 10m 28s)
  • 19:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T355609)', diff saved to https://phabricator.wikimedia.org/P56471 and previous config saved to /var/cache/conftool/dbconfig/20240207-193540-marostegui.json
  • 19:32 joal@deploy2002: Started deploy [analytics/refinery@80b329b]: Analytics Hotfix [analytics/refinery@80b329b5]
  • 19:30 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1202 (T355609)', diff saved to https://phabricator.wikimedia.org/P56470 and previous config saved to /var/cache/conftool/dbconfig/20240207-193016-marostegui.json
  • 19:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1202.eqiad.wmnet with reason: Maintenance
  • 19:29 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1202.eqiad.wmnet with reason: Maintenance
  • 19:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T355609)', diff saved to https://phabricator.wikimedia.org/P56469 and previous config saved to /var/cache/conftool/dbconfig/20240207-192953-marostegui.json
  • 19:19 mutante: people1004 systemctl stop confd; running puppet; checking to remove confd remnants from people* hosts - T356296
  • 19:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P56468 and previous config saved to /var/cache/conftool/dbconfig/20240207-191446-marostegui.json
  • 19:01 brennen: train 1.42.0-wmf.17 (T354435): a couple of blockers currently, waiting on resolution before rolling
  • 18:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P56467 and previous config saved to /var/cache/conftool/dbconfig/20240207-185940-marostegui.json
  • 18:49 btullis@cumin1002: END (PASS) - Cookbook sre.kafka.roll-restart-mirror-maker (exit_code=0) restart MirrorMaker for Kafka A:kafka-mirror-maker-jumbo-eqiad cluster: Roll restart of jvm daemons.
  • 18:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T355609)', diff saved to https://phabricator.wikimedia.org/P56466 and previous config saved to /var/cache/conftool/dbconfig/20240207-184433-marostegui.json
  • 18:39 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1194 (T355609)', diff saved to https://phabricator.wikimedia.org/P56465 and previous config saved to /var/cache/conftool/dbconfig/20240207-183912-marostegui.json
  • 18:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1194.eqiad.wmnet with reason: Maintenance
  • 18:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1194.eqiad.wmnet with reason: Maintenance
  • 18:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T355609)', diff saved to https://phabricator.wikimedia.org/P56464 and previous config saved to /var/cache/conftool/dbconfig/20240207-183849-marostegui.json
  • 18:30 btullis@cumin1002: START - Cookbook sre.kafka.roll-restart-mirror-maker restart MirrorMaker for Kafka A:kafka-mirror-maker-jumbo-eqiad cluster: Roll restart of jvm daemons.
  • 18:25 btullis@cumin1002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-jumbo-eqiad
  • 18:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P56463 and previous config saved to /var/cache/conftool/dbconfig/20240207-182342-marostegui.json
  • 18:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P56462 and previous config saved to /var/cache/conftool/dbconfig/20240207-180835-marostegui.json
  • 17:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T355609)', diff saved to https://phabricator.wikimedia.org/P56461 and previous config saved to /var/cache/conftool/dbconfig/20240207-175328-marostegui.json
  • 17:52 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Unbanning all hosts in search_codfw
  • 17:52 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Unbanning all hosts in search_codfw
  • 17:48 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1191 (T355609)', diff saved to https://phabricator.wikimedia.org/P56460 and previous config saved to /var/cache/conftool/dbconfig/20240207-174807-marostegui.json
  • 17:48 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1191.eqiad.wmnet with reason: Maintenance
  • 17:47 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1191.eqiad.wmnet with reason: Maintenance
  • 17:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T355609)', diff saved to https://phabricator.wikimedia.org/P56459 and previous config saved to /var/cache/conftool/dbconfig/20240207-174745-marostegui.json
  • 17:32 jgiannelos@deploy2002: Finished deploy [restbase/deploy@1007273]: Disabling storage for jawiki (duration: 07m 19s)
  • 17:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P56458 and previous config saved to /var/cache/conftool/dbconfig/20240207-173238-marostegui.json
  • 17:26 btullis@cumin1002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-jumbo-eqiad
  • 17:25 jgiannelos@deploy2002: Started deploy [restbase/deploy@1007273]: Disabling storage for jawiki
  • 17:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P56457 and previous config saved to /var/cache/conftool/dbconfig/20240207-171732-marostegui.json
  • 17:11 hnowlan@puppetmaster1001: conftool action : set/pooled=yes:weight=10; selector: service=thumbor
  • 17:04 sbailey@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifeeds: apply
  • 17:04 sbailey@deploy2002: helmfile [codfw] START helmfile.d/services/wikifeeds: apply
  • 17:03 sbailey@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifeeds: apply
  • 17:03 sbailey@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifeeds: apply
  • 17:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T355609)', diff saved to https://phabricator.wikimedia.org/P56456 and previous config saved to /var/cache/conftool/dbconfig/20240207-170225-marostegui.json
  • 16:57 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1174 (T355609)', diff saved to https://phabricator.wikimedia.org/P56455 and previous config saved to /var/cache/conftool/dbconfig/20240207-165703-marostegui.json
  • 16:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 16:56 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 16:55 sbailey@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
  • 16:54 sbailey@deploy2002: helmfile [staging] START helmfile.d/services/wikifeeds: apply
  • 16:52 hnowlan@cumin2002: conftool action : set/pooled=yes; selector: name=(mw2377.codfw.wmnet|mw2378.codfw.wmnet|mw2406.codfw.wmnet|mw2301.codfw.wmnet|mw2310.codfw.wmnet),cluster=kubernetes,service=kubesvc
  • 16:52 hnowlan@cumin2002: conftool action : set/weight=10; selector: name=(mw2377.codfw.wmnet|mw2378.codfw.wmnet|mw2406.codfw.wmnet|mw2301.codfw.wmnet|mw2310.codfw.wmnet),cluster=kubernetes,service=kubesvc
  • 16:47 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 16:47 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 16:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T355609)', diff saved to https://phabricator.wikimedia.org/P56454 and previous config saved to /var/cache/conftool/dbconfig/20240207-164738-marostegui.json
  • 16:47 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for asw-a-codfw,cr[1-2]-codfw,lsw1-a2-codfw.mgmt
  • 16:47 cmooney@cumin1002: START - Cookbook sre.hosts.remove-downtime for asw-a-codfw,cr[1-2]-codfw,lsw1-a2-codfw.mgmt
  • 16:47 ejegg: fundraising civicrm upgraded from c3dff157 to 98d35c79
  • 16:46 hnowlan: homer 'cr*codfw*' commit 'T354791' for 5 new k8s ex-appservers
  • 16:39 btullis@cumin1002: END (PASS) - Cookbook sre.presto.roll-restart-workers (exit_code=0) for Presto analytics cluster: Roll restart of all Presto's jvm daemons.
  • 16:35 sbailey@deploy2002: helmfile [codfw] DONE helmfile.d/services/proton: apply
  • 16:34 sbailey@deploy2002: helmfile [codfw] START helmfile.d/services/proton: apply
  • 16:33 sbailey@deploy2002: helmfile [eqiad] DONE helmfile.d/services/proton: apply
  • 16:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P56452 and previous config saved to /var/cache/conftool/dbconfig/20240207-163231-marostegui.json
  • 16:32 sbailey@deploy2002: helmfile [eqiad] START helmfile.d/services/proton: apply
  • 16:25 sbailey@deploy2002: helmfile [staging] DONE helmfile.d/services/proton: apply
  • 16:24 sbailey@deploy2002: helmfile [staging] START helmfile.d/services/proton: apply
  • 16:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P56451 and previous config saved to /var/cache/conftool/dbconfig/20240207-161725-marostegui.json
  • 16:17 klausman@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ml-cache2001.codfw.wmnet
  • 16:16 klausman@cumin2002: START - Cookbook sre.hosts.remove-downtime for ml-cache2001.codfw.wmnet
  • 16:16 Emperor: repool codfw dnsdisc T355861
  • 16:16 mvernon@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=swift,name=codfw
  • 16:16 Emperor: repool thanos-fe2001 T355861
  • 16:15 Lucas_WMDE: START lucaswerkmeister-wmde@mwmaint2002:~$ mwscript extensions/DiscussionTools/maintenance/persistRevisionThreadItems.php --wiki ukwiki --current --all --touched-after=20230613000000 --start '["1685316"]' | tee ~/T315510-ukwiki # in tmux
  • 16:10 herron: hard reboot titan1002
  • 16:07 btullis@cumin1002: START - Cookbook sre.presto.roll-restart-workers for Presto analytics cluster: Roll restart of all Presto's jvm daemons.
  • 16:05 jelto: import etherpad-lite 1.9.7-1 on apt1001 host - T316421
  • 16:04 vgutierrez: <topranks> Commencing server uplink moves from old switch to new in codfw rack A2 T355861
  • 16:03 Lucas_WMDE: STOP persistRevisionThreadItems on rowiki for T315510 – according to T315510#9328399, it should be done already (it was at --start '["2075226"]' and had processed 31000, updated 0) [relog from 15:45, stashbot was down]
  • 15:42 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2377.codfw.wmnet with OS bullseye
  • 15:40 Lucas_WMDE: START lucaswerkmeister-wmde@mwmaint2002:~$ mwscript extensions/DiscussionTools/maintenance/persistRevisionThreadItems.php --wiki enwiki --current --all --start '["67578461"]' | tee ~/T315510-enwiki # in tmux
  • 15:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P56447 and previous config saved to /var/cache/conftool/dbconfig/20240207-153656-marostegui.json
  • 15:34 Lucas_WMDE: backport+config window done
  • 15:33 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2378.codfw.wmnet with reason: host reimage
  • 15:32 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for WebVideoTranscodeJob: also add time limits (T356780) (duration: 07m 48s)
  • 15:31 Lucas_WMDE: STOP persistRevisionThreadItems on frwiki for T315510 – 100% CPU usage, 15G RAM and counting, no progress output: clearly stuck on something
  • 15:30 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2406.codfw.wmnet with reason: host reimage
  • 15:28 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2301.codfw.wmnet with reason: host reimage
  • 15:26 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and oblivian: Continuing with sync
  • 15:26 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and oblivian: Backport for WebVideoTranscodeJob: also add time limits (T356780) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 15:25 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2310.codfw.wmnet with reason: host reimage
  • 15:24 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for WebVideoTranscodeJob: also add time limits (T356780)
  • 15:22 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2377.codfw.wmnet with reason: host reimage
  • 15:21 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2310.codfw.wmnet with reason: host reimage
  • 15:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P56446 and previous config saved to /var/cache/conftool/dbconfig/20240207-152150-marostegui.json
  • 15:21 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2301.codfw.wmnet with reason: host reimage
  • 15:21 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2406.codfw.wmnet with reason: host reimage
  • 15:20 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2378.codfw.wmnet with reason: host reimage
  • 15:20 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2377.codfw.wmnet with reason: host reimage
  • 15:13 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ms-be[2044-2050].codfw.wmnet
  • 15:13 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:13 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ms-be[2044-2050].codfw.wmnet decommissioned, removing all IPs except the asset tag one - mvernon@cumin2002"
  • 15:12 Lucas_WMDE: START lucaswerkmeister-wmde@mwmaint2002:~$ mwscript extensions/DiscussionTools/maintenance/persistRevisionThreadItems.php --wiki rowiki --current --all --touched-after=20230613000000 --start '["2041962"]' | tee ~/T315510-rowiki # in tmux
  • 15:10 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ms-be[2044-2050].codfw.wmnet decommissioned, removing all IPs except the asset tag one - mvernon@cumin2002"
  • 15:07 mvernon@cumin2002: START - Cookbook sre.dns.netbox
  • 15:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T355609)', diff saved to https://phabricator.wikimedia.org/P56445 and previous config saved to /var/cache/conftool/dbconfig/20240207-150643-marostegui.json
  • 15:05 Lucas_WMDE: START lucaswerkmeister-wmde@mwmaint2002:~$ mwscript extensions/DiscussionTools/maintenance/persistRevisionThreadItems.php --wiki frwiki --current --all --touched-after=20230613000000 --start '["7544396"]' | tee ~/T315510-frwiki # in tmux
  • 15:05 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2310.codfw.wmnet with OS bullseye
  • 15:05 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2301.codfw.wmnet with OS bullseye
  • 15:05 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2406.codfw.wmnet with OS bullseye
  • 15:04 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2378.codfw.wmnet with OS bullseye
  • 15:04 Lucas_WMDE: STOP script for T315510, forgot to tee it somewhere useful
  • 15:04 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2377.codfw.wmnet with OS bullseye
  • 15:02 Lucas_WMDE: START lucaswerkmeister-wmde@mwmaint2002:~$ mwscript extensions/DiscussionTools/maintenance/persistRevisionThreadItems.php --wiki frwiki --current --all --touched-after=20230613000000 --start '["7544396"]' # T315510, in tmux
  • 15:01 jmm@cumin2002: END (FAIL) - Cookbook sre.puppet.migrate-role (exit_code=99) for role: elasticsearch::cirrus
  • 15:01 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1158 (T355609)', diff saved to https://phabricator.wikimedia.org/P56444 and previous config saved to /var/cache/conftool/dbconfig/20240207-150121-marostegui.json
  • 15:01 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 15:01 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 15:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 15:00 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 14:58 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for ParserObserver: Limit the size of cache of previous parse traces (T351732), ParserObserver: Limit the size of cache of previous parse traces (T351732) (duration: 08m 08s)
  • 14:57 vgutierrez: reboot ncredir2001
  • 14:52 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and matmarex: Continuing with sync
  • 14:52 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and matmarex: Backport for ParserObserver: Limit the size of cache of previous parse traces (T351732), ParserObserver: Limit the size of cache of previous parse traces (T351732) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:50 vgutierrez: reboot ncredir2001
  • 14:50 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for ParserObserver: Limit the size of cache of previous parse traces (T351732), ParserObserver: Limit the size of cache of previous parse traces (T351732)
  • 14:48 arnaudb@cumin1002: dbctl commit (dc=all): 'es2024 (re)pooling @ 100%: kernel upgrade done', diff saved to https://phabricator.wikimedia.org/P56443 and previous config saved to /var/cache/conftool/dbconfig/20240207-144822-arnaudb.json
  • 14:44 klausman@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on ml-cache2001.codfw.wmnet with reason: Machine network link move (T355861)
  • 14:44 klausman@cumin2002: START - Cookbook sre.hosts.downtime for 3:00:00 on ml-cache2001.codfw.wmnet with reason: Machine network link move (T355861)
  • 14:40 mvernon@cumin2002: START - Cookbook sre.hosts.decommission for hosts ms-be[2044-2050].codfw.wmnet
  • 14:33 arnaudb@cumin1002: dbctl commit (dc=all): 'es2024 (re)pooling @ 75%: kernel upgrade done', diff saved to https://phabricator.wikimedia.org/P56442 and previous config saved to /var/cache/conftool/dbconfig/20240207-143317-arnaudb.json
  • 14:33 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: elasticsearch::cirrus
  • 14:32 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
  • 14:32 mvernon@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ms-be2045.codfw.wmnet
  • 14:32 mvernon@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ms-be2045.codfw.wmnet
  • 14:32 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
  • 14:29 volans: deploying debmonitor-client_0.3.5 fleet-wide
  • 14:24 arnaudb@cumin1002: dbctl commit (dc=all): 'db2105 (re)pooling @ 100%: kernel upgrade done', diff saved to https://phabricator.wikimedia.org/P56441 and previous config saved to /var/cache/conftool/dbconfig/20240207-142423-arnaudb.json
  • 14:20 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 14:20 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 14:18 arnaudb@cumin1002: dbctl commit (dc=all): 'es2024 (re)pooling @ 60%: kernel upgrade done', diff saved to https://phabricator.wikimedia.org/P56440 and previous config saved to /var/cache/conftool/dbconfig/20240207-141812-arnaudb.json
  • 14:17 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for Fix PermissionException being logged (T356223), Fix PermissionException being logged (T356223) (duration: 08m 08s)
  • 14:11 logmsgbot: lucaswerkmeister-wmde@deploy2002 jforrester and lucaswerkmeister-wmde: Continuing with sync
  • 14:11 logmsgbot: lucaswerkmeister-wmde@deploy2002 jforrester and lucaswerkmeister-wmde: Backport for Fix PermissionException being logged (T356223), Fix PermissionException being logged (T356223) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:09 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for Fix PermissionException being logged (T356223), Fix PermissionException being logged (T356223)
  • 14:09 arnaudb@cumin1002: dbctl commit (dc=all): 'db2105 (re)pooling @ 75%: kernel upgrade done', diff saved to https://phabricator.wikimedia.org/P56439 and previous config saved to /var/cache/conftool/dbconfig/20240207-140918-arnaudb.json
  • 14:03 arnaudb@cumin1002: dbctl commit (dc=all): 'es2024 (re)pooling @ 30%: kernel upgrade done', diff saved to https://phabricator.wikimedia.org/P56438 and previous config saved to /var/cache/conftool/dbconfig/20240207-140306-arnaudb.json
  • 13:54 arnaudb@cumin1002: dbctl commit (dc=all): 'db2105 (re)pooling @ 60%: kernel upgrade done', diff saved to https://phabricator.wikimedia.org/P56437 and previous config saved to /var/cache/conftool/dbconfig/20240207-135412-arnaudb.json
  • 13:54 klausman@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
  • 13:53 klausman@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
  • 13:53 klausman@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
  • 13:52 klausman@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
  • 13:52 klausman@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
  • 13:52 klausman@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
  • 13:48 arnaudb@cumin1002: dbctl commit (dc=all): 'es2024 (re)pooling @ 15%: kernel upgrade done', diff saved to https://phabricator.wikimedia.org/P56436 and previous config saved to /var/cache/conftool/dbconfig/20240207-134801-arnaudb.json
  • 13:39 arnaudb@cumin1002: dbctl commit (dc=all): 'db2105 (re)pooling @ 30%: kernel upgrade done', diff saved to https://phabricator.wikimedia.org/P56435 and previous config saved to /var/cache/conftool/dbconfig/20240207-133907-arnaudb.json
  • 13:32 jmm@cumin2002: END (FAIL) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=1) rolling reboot on A:ncredir
  • 13:26 arnaudb@cumin1002: dbctl commit (dc=all): 'T344589 - depool es2024', diff saved to https://phabricator.wikimedia.org/P56434 and previous config saved to /var/cache/conftool/dbconfig/20240207-132559-arnaudb.json
  • 13:25 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on es2024.codfw.wmnet with reason: T344589 - kernel upgrade
  • 13:25 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on es2024.codfw.wmnet with reason: T344589 - kernel upgrade
  • 13:24 arnaudb@cumin1002: dbctl commit (dc=all): 'db2105 (re)pooling @ 15%: kernel upgrade done', diff saved to https://phabricator.wikimedia.org/P56433 and previous config saved to /var/cache/conftool/dbconfig/20240207-132402-arnaudb.json
  • 12:46 arnaudb@cumin1002: dbctl commit (dc=all): 'T344589 - depool db2105', diff saved to https://phabricator.wikimedia.org/P56432 and previous config saved to /var/cache/conftool/dbconfig/20240207-124605-arnaudb.json
  • 12:45 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2105.codfw.wmnet with reason: T344589 - kernel upgrade
  • 12:45 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on db2105.codfw.wmnet with reason: T344589 - kernel upgrade
  • 12:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190 (T355609)', diff saved to https://phabricator.wikimedia.org/P56431 and previous config saved to /var/cache/conftool/dbconfig/20240207-124409-marostegui.json
  • 12:35 hnowlan@cumin2002: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on P{lvs1019*,lvs2013*} and A:lvs (T334488)
  • 12:34 hnowlan@cumin2002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on P{lvs1019*,lvs2013*} and A:lvs (T334488)
  • 12:33 jmm@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir
  • 12:32 hnowlan@cumin2002: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on P{lvs1020*,lvs2014*} and A:lvs (T334488)
  • 12:31 hnowlan@cumin2002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on P{lvs1020*,lvs2014*} and A:lvs (T334488)
  • 12:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs1016.eqiad.wmnet
  • 12:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P56430 and previous config saved to /var/cache/conftool/dbconfig/20240207-122903-marostegui.json
  • 12:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1016.eqiad.wmnet
  • 12:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs1015.eqiad.wmnet
  • 12:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1015.eqiad.wmnet
  • 12:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs1014.eqiad.wmnet
  • 12:18 claime: trafficserver: move 40% of traffic to mw on k8s - T355532
  • 12:14 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
  • 12:14 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
  • 12:14 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
  • 12:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1014.eqiad.wmnet
  • 12:14 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
  • 12:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P56429 and previous config saved to /var/cache/conftool/dbconfig/20240207-121356-marostegui.json
  • 12:13 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
  • 12:12 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
  • 12:12 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
  • 12:12 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
  • 12:12 claime: mw-web, mw-api-ext: Raise replicas for 40% traffic - T355532
  • 12:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs1013.eqiad.wmnet
  • 12:02 volans: uploaded debmonitor-client_0.3.5 to apt.wikimedia.org buster-wikimedia,bullseye-wikimedia,bookworm-wikimedia
  • 12:01 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master-eqiad
  • 11:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190 (T355609)', diff saved to https://phabricator.wikimedia.org/P56428 and previous config saved to /var/cache/conftool/dbconfig/20240207-115849-marostegui.json
  • 11:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1013.eqiad.wmnet
  • 11:56 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
  • 11:56 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
  • 11:56 jmm@cumin2002: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master-eqiad
  • 11:53 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master-codfw
  • 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
  • 11:49 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
  • 11:48 jmm@cumin2002: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master-codfw
  • 11:33 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2190 (T355609)', diff saved to https://phabricator.wikimedia.org/P56427 and previous config saved to /var/cache/conftool/dbconfig/20240207-113339-marostegui.json
  • 11:33 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2190.codfw.wmnet with reason: Maintenance
  • 11:33 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2190.codfw.wmnet with reason: Maintenance
  • 11:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T355609)', diff saved to https://phabricator.wikimedia.org/P56426 and previous config saved to /var/cache/conftool/dbconfig/20240207-113317-marostegui.json
  • 11:18 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P56425 and previous config saved to /var/cache/conftool/dbconfig/20240207-111810-marostegui.json
  • 11:08 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1160.eqiad.wmnet with reason: Maintenance
  • 11:08 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1160.eqiad.wmnet with reason: Maintenance
  • 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: mariadb::parsercache
  • 11:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P56424 and previous config saved to /var/cache/conftool/dbconfig/20240207-110304-marostegui.json
  • 11:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host elastic2109.codfw.wmnet
  • 10:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host elastic2109.codfw.wmnet
  • 10:51 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: mariadb::parsercache
  • 10:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host elastic2108.codfw.wmnet
  • 10:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T355609)', diff saved to https://phabricator.wikimedia.org/P56423 and previous config saved to /var/cache/conftool/dbconfig/20240207-104757-marostegui.json
  • 10:44 isaranto@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 10:44 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host elastic2108.codfw.wmnet
  • 10:39 isaranto@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 10:37 klausman@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
  • 10:36 klausman@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
  • 10:25 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2177 (T355609)', diff saved to https://phabricator.wikimedia.org/P56422 and previous config saved to /var/cache/conftool/dbconfig/20240207-102535-marostegui.json
  • 10:25 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2177.codfw.wmnet with reason: Maintenance
  • 10:25 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2177.codfw.wmnet with reason: Maintenance
  • 10:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T355609)', diff saved to https://phabricator.wikimedia.org/P56421 and previous config saved to /var/cache/conftool/dbconfig/20240207-102513-marostegui.json
  • 10:24 Dreamy_Jazz: Finished security deploys for T356183
  • 10:23 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps2009.codfw.wmnet
  • 10:23 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps1009.eqiad.wmnet
  • 10:19 logmsgbot: dreamyjazz Deployed security patch for T356183
  • 10:12 Dreamy_Jazz: Continuing security deploy for T356183
  • 10:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P56420 and previous config saved to /var/cache/conftool/dbconfig/20240207-101006-marostegui.json
  • 09:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P56419 and previous config saved to /var/cache/conftool/dbconfig/20240207-095500-marostegui.json
  • 09:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host elastic2107.codfw.wmnet
  • 09:45 logmsgbot: dreamyjazz Deployed security patch for T356183
  • 09:41 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host elastic2107.codfw.wmnet
  • 09:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T355609)', diff saved to https://phabricator.wikimedia.org/P56418 and previous config saved to /var/cache/conftool/dbconfig/20240207-093953-marostegui.json
  • 09:31 jayme: removing a bunch of old kernel versions from chartmuseum* to free ~3.5GB disk space
  • 09:30 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: mariadb::core_test
  • 09:26 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db116