Server Admin Log/Archive 75

From Wikitech


2024-01-31

  • 23:11 eileen: civicrm upgraded from 6344c95e to 6e1e0d21
  • 22:29 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 22:29 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 22:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T355609)', diff saved to https://phabricator.wikimedia.org/P56010 and previous config saved to /var/cache/conftool/dbconfig/20240131-222853-marostegui.json
  • 22:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P56009 and previous config saved to /var/cache/conftool/dbconfig/20240131-221347-marostegui.json
  • 22:11 jdrewniak@deploy2002: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 06m 43s)
  • 22:05 jdrewniak@deploy2002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 07m 26s)
  • 21:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P56008 and previous config saved to /var/cache/conftool/dbconfig/20240131-215840-marostegui.json
  • 21:54 Dreamy_Jazz: Removed already applied patches for T347708 from /srv/patches
  • 21:48 dancy@deploy2002: Synchronized php: group1 wikis to 1.42.0-wmf.16 refs T354434 (duration: 06m 47s)
  • 21:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T355609)', diff saved to https://phabricator.wikimedia.org/P56007 and previous config saved to /var/cache/conftool/dbconfig/20240131-214334-marostegui.json
  • 21:42 dancy@deploy2002: rebuilt and synchronized wikiversions files: group1 wikis to 1.42.0-wmf.16 refs T354434
  • 21:35 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1233 (T355609)', diff saved to https://phabricator.wikimedia.org/P56006 and previous config saved to /var/cache/conftool/dbconfig/20240131-213454-marostegui.json
  • 21:35 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1233.eqiad.wmnet with reason: Maintenance
  • 21:34 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1233.eqiad.wmnet with reason: Maintenance
  • 21:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T355609)', diff saved to https://phabricator.wikimedia.org/P56005 and previous config saved to /var/cache/conftool/dbconfig/20240131-213432-marostegui.json
  • 21:31 Dreamy_Jazz: Security deploy done
  • 21:30 logmsgbot: dreamyjazz Deployed security patch for T356226
  • 21:23 logmsgbot: dreamyjazz Deployed security patch for T356226
  • 21:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P56004 and previous config saved to /var/cache/conftool/dbconfig/20240131-211926-marostegui.json
  • 21:16 Dreamy_Jazz: Doing security deploy for T356226
  • 21:12 jforrester@deploy2002: Finished scap: Backport for Gadget: Bump GADGET_CLASS_VERSION (T356322) (duration: 08m 31s)
  • 21:05 jforrester@deploy2002: jforrester and reedy: Continuing with sync
  • 21:05 jforrester@deploy2002: jforrester and reedy: Backport for Gadget: Bump GADGET_CLASS_VERSION (T356322) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P56003 and previous config saved to /var/cache/conftool/dbconfig/20240131-210419-marostegui.json
  • 21:03 jforrester@deploy2002: Started scap: Backport for Gadget: Bump GADGET_CLASS_VERSION (T356322)
  • 20:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T355609)', diff saved to https://phabricator.wikimedia.org/P56002 and previous config saved to /var/cache/conftool/dbconfig/20240131-204913-marostegui.json
  • 20:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1229 (T355609)', diff saved to https://phabricator.wikimedia.org/P56001 and previous config saved to /var/cache/conftool/dbconfig/20240131-204439-marostegui.json
  • 20:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1229.eqiad.wmnet with reason: Maintenance
  • 20:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1229.eqiad.wmnet with reason: Maintenance
  • 20:37 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1225.eqiad.wmnet with reason: Maintenance
  • 20:37 eevans@deploy2002: helmfile [eqiad] DONE helmfile.d/services/sessionstore: apply
  • 20:37 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1225.eqiad.wmnet with reason: Maintenance
  • 20:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1222 (T355609)', diff saved to https://phabricator.wikimedia.org/P56000 and previous config saved to /var/cache/conftool/dbconfig/20240131-203704-marostegui.json
  • 20:36 eevans@deploy2002: helmfile [eqiad] START helmfile.d/services/sessionstore: apply
  • 20:36 eevans@deploy2002: helmfile [staging] DONE helmfile.d/services/sessionstore: sync
  • 20:35 eevans@deploy2002: helmfile [staging] START helmfile.d/services/sessionstore: sync
  • 20:35 eevans@deploy2002: helmfile [codfw] DONE helmfile.d/services/sessionstore: sync
  • 20:35 eevans@deploy2002: helmfile [codfw] START helmfile.d/services/sessionstore: sync
  • 20:33 urbanecm: [urbanecm@mwmaint2002 ~]$ mwscript userOptions.php --wiki=testwiki --old-is-default --old=2 --new 1 --nowarn 'echo-subscriptions-web-reverted' # T353225
  • 20:32 eevans@deploy2002: helmfile [codfw] DONE helmfile.d/services/sessionstore: apply
  • 20:31 eevans@deploy2002: helmfile [codfw] START helmfile.d/services/sessionstore: apply
  • 20:28 joal@deploy2002: Finished deploy [analytics/refinery@b738b3f] (hadoop-test): HOTFIX analytics weekly train - Test [analytics/refinery@b738b3fd] (duration: 03m 35s)
  • 20:28 eevans@deploy2002: helmfile [staging] DONE helmfile.d/services/sessionstore: apply
  • 20:27 eevans@deploy2002: helmfile [staging] START helmfile.d/services/sessionstore: apply
  • 20:25 joal@deploy2002: Started deploy [analytics/refinery@b738b3f] (hadoop-test): HOTFIX analytics weekly train - Test [analytics/refinery@b738b3fd]
  • 20:24 joal@deploy2002: Finished deploy [analytics/refinery@b738b3f] (thin): HOTFIX analytics weekly train -THIN [analytics/refinery@b738b3fd] (duration: 00m 05s)
  • 20:24 joal@deploy2002: Started deploy [analytics/refinery@b738b3f] (thin): HOTFIX analytics weekly train -THIN [analytics/refinery@b738b3fd]
  • 20:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P55999 and previous config saved to /var/cache/conftool/dbconfig/20240131-202158-marostegui.json
  • 20:10 joal@deploy2002: Finished deploy [analytics/refinery@b738b3f]: HOTFIX analytics weekly train [analytics/refinery@b738b3fd] (duration: 10m 51s)
  • 20:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P55998 and previous config saved to /var/cache/conftool/dbconfig/20240131-200652-marostegui.json
  • 19:59 joal@deploy2002: Started deploy [analytics/refinery@b738b3f]: HOTFIX analytics weekly train [analytics/refinery@b738b3fd]
  • 19:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1222 (T355609)', diff saved to https://phabricator.wikimedia.org/P55997 and previous config saved to /var/cache/conftool/dbconfig/20240131-195145-marostegui.json
  • 19:39 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1222 (T355609)', diff saved to https://phabricator.wikimedia.org/P55996 and previous config saved to /var/cache/conftool/dbconfig/20240131-193927-marostegui.json
  • 19:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1222.eqiad.wmnet with reason: Maintenance
  • 19:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1222.eqiad.wmnet with reason: Maintenance
  • 19:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T355609)', diff saved to https://phabricator.wikimedia.org/P55994 and previous config saved to /var/cache/conftool/dbconfig/20240131-193905-marostegui.json
  • 19:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P55993 and previous config saved to /var/cache/conftool/dbconfig/20240131-192359-marostegui.json
  • 19:17 dancy@deploy2002: rebuilt and synchronized wikiversions files: group0 wikis to 1.42.0-wmf.16 refs T354434
  • 19:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P55992 and previous config saved to /var/cache/conftool/dbconfig/20240131-190852-marostegui.json
  • 18:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T355609)', diff saved to https://phabricator.wikimedia.org/P55991 and previous config saved to /var/cache/conftool/dbconfig/20240131-185345-marostegui.json
  • 18:49 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1197 (T355609)', diff saved to https://phabricator.wikimedia.org/P55990 and previous config saved to /var/cache/conftool/dbconfig/20240131-184900-marostegui.json
  • 18:48 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1197.eqiad.wmnet with reason: Maintenance
  • 18:48 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1197.eqiad.wmnet with reason: Maintenance
  • 18:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T355609)', diff saved to https://phabricator.wikimedia.org/P55989 and previous config saved to /var/cache/conftool/dbconfig/20240131-184838-marostegui.json
  • 18:40 phuedx@deploy2002: Finished deploy [airflow-dags/analytics@5078a6b]: (no justification provided) (duration: 00m 28s)
  • 18:40 phuedx@deploy2002: Started deploy [airflow-dags/analytics@5078a6b]: (no justification provided)
  • 18:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P55988 and previous config saved to /var/cache/conftool/dbconfig/20240131-183332-marostegui.json
  • 18:18 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P55986 and previous config saved to /var/cache/conftool/dbconfig/20240131-181825-marostegui.json
  • 18:04 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cloudelastic1010.eqiad.wmnet with reason: T355617
  • 18:04 bking@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on cloudelastic1010.eqiad.wmnet with reason: T355617
  • 18:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T355609)', diff saved to https://phabricator.wikimedia.org/P55985 and previous config saved to /var/cache/conftool/dbconfig/20240131-180319-marostegui.json
  • 17:58 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1188 (T355609)', diff saved to https://phabricator.wikimedia.org/P55984 and previous config saved to /var/cache/conftool/dbconfig/20240131-175833-marostegui.json
  • 17:58 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1188.eqiad.wmnet with reason: Maintenance
  • 17:58 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1188.eqiad.wmnet with reason: Maintenance
  • 17:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T355609)', diff saved to https://phabricator.wikimedia.org/P55983 and previous config saved to /var/cache/conftool/dbconfig/20240131-175811-marostegui.json
  • 17:51 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2005.codfw.wmnet with OS bookworm
  • 17:50 aokoth@cumin1002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM vrts1001.eqiad.wmnet
  • 17:46 aokoth@cumin1002: START - Cookbook sre.ganeti.reboot-vm for VM vrts1001.eqiad.wmnet
  • 17:45 aokoth@cumin1002: END (FAIL) - Cookbook sre.ganeti.reboot-vm (exit_code=99) for VM vrts1001.eqiad.wmnet
  • 17:45 aokoth@cumin1002: START - Cookbook sre.ganeti.reboot-vm for VM vrts1001.eqiad.wmnet
  • 17:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P55982 and previous config saved to /var/cache/conftool/dbconfig/20240131-174305-marostegui.json
  • 17:35 phuedx@deploy2002: Finished deploy [analytics/refinery@bef134c] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@bef134c2] (duration: 03m 29s)
  • 17:31 phuedx@deploy2002: Started deploy [analytics/refinery@bef134c] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@bef134c2]
  • 17:31 phuedx@deploy2002: Finished deploy [analytics/refinery@bef134c] (thin): Regular analytics weekly train THIN [analytics/refinery@bef134c2] (duration: 00m 08s)
  • 17:30 phuedx@deploy2002: Started deploy [analytics/refinery@bef134c] (thin): Regular analytics weekly train THIN [analytics/refinery@bef134c2]
  • 17:30 phuedx@deploy2002: Finished deploy [analytics/refinery@bef134c]: Regular analytics weekly train [analytics/refinery@bef134c2] (duration: 11m 05s)
  • 17:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P55981 and previous config saved to /var/cache/conftool/dbconfig/20240131-172758-marostegui.json
  • 17:19 phuedx@deploy2002: Started deploy [analytics/refinery@bef134c]: Regular analytics weekly train [analytics/refinery@bef134c2]
  • 17:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T355609)', diff saved to https://phabricator.wikimedia.org/P55980 and previous config saved to /var/cache/conftool/dbconfig/20240131-171252-marostegui.json
  • 17:02 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1182 (T355609)', diff saved to https://phabricator.wikimedia.org/P55979 and previous config saved to /var/cache/conftool/dbconfig/20240131-170141-marostegui.json
  • 17:02 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
  • 17:02 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
  • 17:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T355609)', diff saved to https://phabricator.wikimedia.org/P55978 and previous config saved to /var/cache/conftool/dbconfig/20240131-170120-marostegui.json
  • 17:01 phuedx@deploy2002: Finished deploy [analytics/refinery@2c00cad] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@2c00cad1] (duration: 03m 35s)
  • 16:57 ejegg: fundraising civicrm upgraded from 520337a0 to 6344c95e
  • 16:57 phuedx@deploy2002: Started deploy [analytics/refinery@2c00cad] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@2c00cad1]
  • 16:56 phuedx@deploy2002: Finished deploy [analytics/refinery@2c00cad] (thin): Regular analytics weekly train THIN [analytics/refinery@2c00cad1] (duration: 00m 06s)
  • 16:56 phuedx@deploy2002: Started deploy [analytics/refinery@2c00cad] (thin): Regular analytics weekly train THIN [analytics/refinery@2c00cad1]
  • 16:54 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host sretest2005.codfw.wmnet with OS bookworm
  • 16:52 phuedx@deploy2002: Finished deploy [analytics/refinery@2c00cad]: Regular analytics weekly train [analytics/refinery@2c00cad1] (duration: 09m 52s)
  • 16:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P55977 and previous config saved to /var/cache/conftool/dbconfig/20240131-164613-marostegui.json
  • 16:43 phuedx@deploy2002: Started deploy [analytics/refinery@2c00cad]: Regular analytics weekly train [analytics/refinery@2c00cad1]
  • 16:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P55976 and previous config saved to /var/cache/conftool/dbconfig/20240131-163106-marostegui.json
  • 16:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T355609)', diff saved to https://phabricator.wikimedia.org/P55974 and previous config saved to /var/cache/conftool/dbconfig/20240131-161600-marostegui.json
  • 16:07 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1170:3312 (T355609)', diff saved to https://phabricator.wikimedia.org/P55973 and previous config saved to /var/cache/conftool/dbconfig/20240131-160624-marostegui.json
  • 16:06 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 16:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 16:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T355609)', diff saved to https://phabricator.wikimedia.org/P55972 and previous config saved to /var/cache/conftool/dbconfig/20240131-160602-marostegui.json
  • 16:01 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
  • 15:58 moritzm: installing openssh security updates
  • 15:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moscovium.eqiad.wmnet
  • 15:52 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host moscovium.eqiad.wmnet
  • 15:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P55970 and previous config saved to /var/cache/conftool/dbconfig/20240131-155055-marostegui.json
  • 15:50 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/thumbor: apply
  • 15:47 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 15:47 jgiannelos@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
  • 15:47 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 15:46 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 15:46 jgiannelos@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
  • 15:45 jgiannelos@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
  • 15:45 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 15:45 jgiannelos@deploy2002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
  • 15:44 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 15:43 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 15:41 ayounsi@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm2006.codfw.wmnet
  • 15:41 ayounsi@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:41 ayounsi@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: testvm2006.codfw.wmnet decommissioned, removing all IPs except the asset tag one - ayounsi@cumin2002"
  • 15:39 ayounsi@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: testvm2006.codfw.wmnet decommissioned, removing all IPs except the asset tag one - ayounsi@cumin2002"
  • 15:36 ayounsi@cumin2002: START - Cookbook sre.dns.netbox
  • 15:36 hnowlan@puppetmaster1001: conftool action : set/pooled=yes:weight=10; selector: name=maps2009.codfw.wmnet
  • 15:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P55969 and previous config saved to /var/cache/conftool/dbconfig/20240131-153549-marostegui.json
  • 15:34 hnowlan@puppetmaster1001: conftool action : set/weight=10; selector: name=maps1009.eqiad.wmnet
  • 15:32 ayounsi@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2006.codfw.wmnet
  • 15:29 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps1009.eqiad.wmnet
  • 15:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T355609)', diff saved to https://phabricator.wikimedia.org/P55968 and previous config saved to /var/cache/conftool/dbconfig/20240131-152042-marostegui.json
  • 15:18 jgiannelos@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
  • 15:17 jgiannelos@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
  • 15:17 jgiannelos@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
  • 15:16 jgiannelos@deploy2002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
  • 15:16 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
  • 15:16 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
  • 15:16 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
  • 15:14 btullis@cumin1002: END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas (exit_code=0) rolling reboot on A:schema
  • 15:14 jgiannelos@deploy2002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
  • 15:14 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
  • 15:14 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
  • 15:10 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1156 (T355609)', diff saved to https://phabricator.wikimedia.org/P55967 and previous config saved to /var/cache/conftool/dbconfig/20240131-151016-marostegui.json
  • 15:10 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 15:10 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 15:09 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
  • 15:09 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
  • 15:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T355609)', diff saved to https://phabricator.wikimedia.org/P55966 and previous config saved to /var/cache/conftool/dbconfig/20240131-150934-marostegui.json
  • 15:09 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 15:08 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 15:08 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 15:07 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 15:06 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/thumbor: apply
  • 15:05 filippo@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 14:58 btullis@cumin1002: START - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas rolling reboot on A:schema
  • 14:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P55965 and previous config saved to /var/cache/conftool/dbconfig/20240131-145427-marostegui.json
  • 14:53 brouberol: I'm going to apply kafka log compaction for {eqiad,codfw}.mediawiki.cirrussearch.page_rerender.v1 on kafka-main-eqiad only (current replica) - T354794
  • 14:52 filippo@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 14:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lists2001.codfw.wmnet
  • 14:46 urbanecm@deploy2002: Finished scap: Backport for Add WikimediaCampaignEvents to extension list (T347894) (duration: 10m 41s)
  • 14:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host lists2001.codfw.wmnet
  • 14:43 filippo@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 14:40 urbanecm@deploy2002: cmelo and urbanecm: Continuing with sync
  • 14:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P55964 and previous config saved to /var/cache/conftool/dbconfig/20240131-143921-marostegui.json
  • 14:37 urbanecm@deploy2002: cmelo and urbanecm: Backport for Add WikimediaCampaignEvents to extension list (T347894) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:36 urbanecm@deploy2002: Started scap: Backport for Add WikimediaCampaignEvents to extension list (T347894)
  • 14:30 urbanecm@deploy2002: Finished scap: Backport for [metawiki] Let admins add/remove the event-organizer group (T356070), index.php: Restore support for forcesafemode option. (T355314) (duration: 10m 05s)
  • 14:28 filippo@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 14:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T355609)', diff saved to https://phabricator.wikimedia.org/P55963 and previous config saved to /var/cache/conftool/dbconfig/20240131-142413-marostegui.json
  • 14:23 urbanecm@deploy2002: daimona and matmarex and urbanecm: Continuing with sync
  • 14:21 urbanecm@deploy2002: daimona and matmarex and urbanecm: Backport for [metawiki] Let admins add/remove the event-organizer group (T356070), index.php: Restore support for forcesafemode option. (T355314) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:21 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase2020.codfw.wmnet with reason: Decommissioning — T352469
  • 14:20 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase2020.codfw.wmnet with reason: Decommissioning — T352469
  • 14:20 urbanecm@deploy2002: Started scap: Backport for [metawiki] Let admins add/remove the event-organizer group (T356070), index.php: Restore support for forcesafemode option. (T355314)
  • 14:19 urbanecm@deploy2002: Finished scap: Backport for decodeURI fragments before sending them to discussiontoolsfindcomment (T356199), decodeURI fragments before sending them to discussiontoolsfindcomment (T356199), Add an exception for ConvenientDiscussions-style permalinks (T349653), Add an exception for ConvenientDiscussions-style permalinks (T349653) (gerrit:994709)
  • 14:18 urbanecm: [urbanecm@mwmaint2002 ~]$ mwscript migrateUserGroup.php --wiki=metawiki campaignevents-beta-tester event-organizer # T356070
  • 14:13 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1146:3312 (T355609)', diff saved to https://phabricator.wikimedia.org/P55962 and previous config saved to /var/cache/conftool/dbconfig/20240131-141316-marostegui.json
  • 14:13 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
  • 14:13 urbanecm@deploy2002: urbanecm and kemayo and matmarex and daimona: Continuing with sync
  • 14:13 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
  • 14:10 urbanecm@deploy2002: urbanecm and kemayo and matmarex and daimona: Backport for decodeURI fragments before sending them to discussiontoolsfindcomment (T356199), decodeURI fragments before sending them to discussiontoolsfindcomment (T356199), Add an exception for ConvenientDiscussions-style permalinks (T349653), gerrit:994709 Add an exception for ConvenientDiscuss
  • 14:09 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
  • 14:08 urbanecm@deploy2002: Started scap: Backport for decodeURI fragments before sending them to discussiontoolsfindcomment (T356199), decodeURI fragments before sending them to discussiontoolsfindcomment (T356199), Add an exception for ConvenientDiscussions-style permalinks (T349653), Add an exception for ConvenientDiscussions-style permalinks (T349653) (gerrit:994709)
  • 14:08 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
  • 14:07 urbanecm@deploy2002: Finished scap: Backport for testwiki: Temporarily change default value for 4 Echo properties (T353225) (duration: 19m 37s)
  • 14:04 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
  • 14:04 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
  • 14:00 urbanecm@deploy2002: urbanecm: Continuing with sync
  • 13:54 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host people2003.codfw.wmnet
  • 13:51 urbanecm@deploy2002: urbanecm: Backport for testwiki: Temporarily change default value for 4 Echo properties (T353225) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:48 jelto@cumin1002: START - Cookbook sre.hosts.reboot-single for host people2003.codfw.wmnet
  • 13:48 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host planet1003.eqiad.wmnet
  • 13:48 urbanecm@deploy2002: Started scap: Backport for testwiki: Temporarily change default value for 4 Echo properties (T353225)
  • 13:44 jelto@cumin1002: START - Cookbook sre.hosts.reboot-single for host planet1003.eqiad.wmnet
  • 13:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189 (T355609)', diff saved to https://phabricator.wikimedia.org/P55960 and previous config saved to /var/cache/conftool/dbconfig/20240131-133143-marostegui.json
  • 13:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow6001.drmrs.wmnet
  • 13:23 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow6001.drmrs.wmnet
  • 13:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow5002.eqsin.wmnet
  • 13:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P55959 and previous config saved to /var/cache/conftool/dbconfig/20240131-131637-marostegui.json
  • 13:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow5002.eqsin.wmnet
  • 13:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow4002.ulsfo.wmnet
  • 13:09 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow4002.ulsfo.wmnet
  • 13:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow3003.esams.wmnet
  • 13:04 filippo@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host titan1002.eqiad.wmnet
  • 13:04 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow3003.esams.wmnet
  • 13:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow2003.codfw.wmnet
  • 13:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P55957 and previous config saved to /var/cache/conftool/dbconfig/20240131-130130-marostegui.json
  • 12:58 filippo@cumin1002: START - Cookbook sre.hosts.reboot-single for host titan1002.eqiad.wmnet
  • 12:57 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow2003.codfw.wmnet
  • 12:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow1002.eqiad.wmnet
  • 12:48 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow1002.eqiad.wmnet
  • 12:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189 (T355609)', diff saved to https://phabricator.wikimedia.org/P55956 and previous config saved to /var/cache/conftool/dbconfig/20240131-124623-marostegui.json
  • 12:44 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
  • 12:44 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
  • 12:44 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
  • 12:44 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
  • 12:42 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host netmon1003.wikimedia.org
  • 12:32 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2189 (T355609)', diff saved to https://phabricator.wikimedia.org/P55955 and previous config saved to /var/cache/conftool/dbconfig/20240131-123224-marostegui.json
  • 12:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2189.codfw.wmnet with reason: Maintenance
  • 12:32 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2189.codfw.wmnet with reason: Maintenance
  • 12:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T355609)', diff saved to https://phabricator.wikimedia.org/P55954 and previous config saved to /var/cache/conftool/dbconfig/20240131-123203-marostegui.json
  • 12:31 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netmon1003.wikimedia.org
  • 12:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netmon2002.wikimedia.org
  • 12:24 btullis@cumin1002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host dbstore1009.eqiad.wmnet
  • 12:22 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netmon2002.wikimedia.org
  • 12:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt-staging2001.codfw.wmnet
  • 12:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P55953 and previous config saved to /var/cache/conftool/dbconfig/20240131-121656-marostegui.json
  • 12:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt-staging2001.codfw.wmnet
  • 12:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-replica2008.wikimedia.org
  • 12:13 claime: Raising external traffic to mw-on-k8s to 35% - T355532
  • 12:13 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host stewards2001.codfw.wmnet
  • 12:12 btullis@cumin1002: START - Cookbook sre.hosts.reboot-single for host dbstore1009.eqiad.wmnet
  • 12:11 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dbstore1008.eqiad.wmnet
  • 12:11 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-replica2008.wikimedia.org
  • 12:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-replica2007.wikimedia.org
  • 12:10 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
  • 12:10 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
  • 12:10 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
  • 12:09 jelto@cumin1002: START - Cookbook sre.hosts.reboot-single for host stewards2001.codfw.wmnet
  • 12:08 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host stewards1001.eqiad.wmnet
  • 12:08 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
  • 12:08 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
  • 12:08 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
  • 12:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-replica2007.wikimedia.org
  • 12:07 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
  • 12:07 akosiaris@deploy2002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply
  • 12:06 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
  • 12:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-replica1006.wikimedia.org
  • 12:05 akosiaris@deploy2002: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply
  • 12:05 akosiaris@deploy2002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply
  • 12:04 jelto@cumin1002: START - Cookbook sre.hosts.reboot-single for host stewards1001.eqiad.wmnet
  • 12:04 akosiaris@deploy2002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
  • 12:04 akosiaris@deploy2002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply
  • 12:03 akosiaris@deploy2002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply
  • 12:03 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host planet2003.codfw.wmnet
  • 12:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-replica1006.wikimedia.org
  • 12:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P55952 and previous config saved to /var/cache/conftool/dbconfig/20240131-120150-marostegui.json
  • 12:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-replica1005.wikimedia.org
  • 12:00 btullis@cumin1002: START - Cookbook sre.hosts.reboot-single for host dbstore1008.eqiad.wmnet
  • 11:59 jelto@cumin1002: START - Cookbook sre.hosts.reboot-single for host planet2003.codfw.wmnet
  • 11:57 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host people1004.eqiad.wmnet
  • 11:57 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-replica1005.wikimedia.org
  • 11:51 jelto@cumin1002: START - Cookbook sre.hosts.reboot-single for host people1004.eqiad.wmnet
  • 11:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T355609)', diff saved to https://phabricator.wikimedia.org/P55951 and previous config saved to /var/cache/conftool/dbconfig/20240131-114643-marostegui.json
  • 11:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard1003.eqiad.wmnet
  • 11:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetboard1003.eqiad.wmnet
  • 11:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard2003.codfw.wmnet
  • 11:38 stevemunene@cumin1002: END (FAIL) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=99) for hosts an-worker[1157-1175].eqiad.wmnet
  • 11:38 stevemunene@cumin1002: START - Cookbook sre.hadoop.init-hadoop-workers for hosts an-worker[1157-1175].eqiad.wmnet
  • 11:37 stevemunene@cumin1002: END (FAIL) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=99) for hosts an-worker[1157-1175].eqiad.wmnet
  • 11:35 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetboard2003.codfw.wmnet
  • 11:35 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2175 (T355609)', diff saved to https://phabricator.wikimedia.org/P55950 and previous config saved to /var/cache/conftool/dbconfig/20240131-113518-marostegui.json
  • 11:35 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2175.codfw.wmnet with reason: Maintenance
  • 11:35 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2175.codfw.wmnet with reason: Maintenance
  • 11:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312 (T355609)', diff saved to https://phabricator.wikimedia.org/P55949 and previous config saved to /var/cache/conftool/dbconfig/20240131-113456-marostegui.json
  • 11:34 filippo@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host titan1001.eqiad.wmnet
  • 11:29 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1424.eqiad.wmnet with OS bullseye
  • 11:28 ayounsi@cumin1002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=97) for new host testvm2006.codfw.wmnet
  • 11:27 ayounsi@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host testvm2006.codfw.wmnet with OS bookworm
  • 11:27 filippo@cumin1002: START - Cookbook sre.hosts.reboot-single for host titan1001.eqiad.wmnet
  • 11:26 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1423.eqiad.wmnet with OS bullseye
  • 11:24 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1425.eqiad.wmnet with OS bullseye
  • 11:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312', diff saved to https://phabricator.wikimedia.org/P55948 and previous config saved to /var/cache/conftool/dbconfig/20240131-111949-marostegui.json
  • 11:11 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1424.eqiad.wmnet with reason: host reimage
  • 11:08 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1423.eqiad.wmnet with reason: host reimage
  • 11:05 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1425.eqiad.wmnet with reason: host reimage
  • 11:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312', diff saved to https://phabricator.wikimedia.org/P55947 and previous config saved to /var/cache/conftool/dbconfig/20240131-110442-marostegui.json
  • 11:02 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1424.eqiad.wmnet with reason: host reimage
  • 11:02 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1423.eqiad.wmnet with reason: host reimage
  • 11:01 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1425.eqiad.wmnet with reason: host reimage
  • 10:53 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) testvm2006.codfw.wmnet on all recursors
  • 10:53 ayounsi@cumin1002: START - Cookbook sre.dns.wipe-cache testvm2006.codfw.wmnet on all recursors
  • 10:51 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:51 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: codfw routed cluster tap - ayounsi@cumin1002"
  • 10:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312 (T355609)', diff saved to https://phabricator.wikimedia.org/P55946 and previous config saved to /var/cache/conftool/dbconfig/20240131-104936-marostegui.json
  • 10:49 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: codfw routed cluster tap - ayounsi@cumin1002"
  • 10:48 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw1424.eqiad.wmnet with OS bullseye
  • 10:48 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw1423.eqiad.wmnet with OS bullseye
  • 10:48 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw1425.eqiad.wmnet with OS bullseye
  • 10:46 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
  • 10:43 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
  • 10:42 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
  • 10:41 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
  • 10:41 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
  • 10:40 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be1003.eqiad.wmnet
  • 10:38 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2170:3312 (T355609)', diff saved to https://phabricator.wikimedia.org/P55945 and previous config saved to /var/cache/conftool/dbconfig/20240131-103830-marostegui.json
  • 10:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2170.codfw.wmnet with reason: Maintenance
  • 10:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2170.codfw.wmnet with reason: Maintenance
  • 10:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T355609)', diff saved to https://phabricator.wikimedia.org/P55944 and previous config saved to /var/cache/conftool/dbconfig/20240131-103807-marostegui.json
  • 10:36 stevemunene@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1157.eqiad.wmnet
  • 10:35 btullis@deploy2002: Finished deploy [analytics/refinery@13f7a06] (hadoop-test): Ad-hoc deploy of refinery TEST for T354703 [analytics/refinery@13f7a06c] (duration: 00m 07s)
  • 10:35 btullis@deploy2002: Started deploy [analytics/refinery@13f7a06] (hadoop-test): Ad-hoc deploy of refinery TEST for T354703 [analytics/refinery@13f7a06c]
  • 10:35 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host moss-be1003.eqiad.wmnet
  • 10:33 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on testvm2006.codfw.wmnet with reason: host reimage
  • 10:30 btullis@deploy2002: Finished deploy [analytics/refinery@13f7a06] (hadoop-test): Ad-hoc deploy of refinery TEST for T354703 [analytics/refinery@13f7a06c] (duration: 00m 05s)
  • 10:30 btullis@deploy2002: Started deploy [analytics/refinery@13f7a06] (hadoop-test): Ad-hoc deploy of refinery TEST for T354703 [analytics/refinery@13f7a06c]
  • 10:30 ayounsi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on testvm2006.codfw.wmnet with reason: host reimage
  • 10:30 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be1002.eqiad.wmnet
  • 10:29 stevemunene@cumin1002: START - Cookbook sre.hosts.reboot-single for host an-worker1157.eqiad.wmnet
  • 10:25 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testreduce1002.eqiad.wmnet
  • 10:24 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host moss-be1002.eqiad.wmnet
  • 10:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P55943 and previous config saved to /var/cache/conftool/dbconfig/20240131-102300-marostegui.json
  • 10:21 cgoubert@cumin2002: START - Cookbook sre.hosts.reboot-single for host testreduce1002.eqiad.wmnet
  • 10:20 cgoubert@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host testreduce1002.eqiad.wmnet
  • 10:20 cgoubert@cumin2002: START - Cookbook sre.hosts.reboot-single for host testreduce1002.eqiad.wmnet
  • 10:10 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be1001.eqiad.wmnet
  • 10:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P55942 and previous config saved to /var/cache/conftool/dbconfig/20240131-100754-marostegui.json
  • 10:03 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host moss-be1001.eqiad.wmnet
  • 10:02 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host testvm2006.codfw.wmnet with OS bookworm
  • 10:00 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be2003.codfw.wmnet
  • 09:53 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host moss-be2003.codfw.wmnet
  • 09:53 mvernon@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host moss-be2003.codfw.wmnet
  • 09:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T355609)', diff saved to https://phabricator.wikimedia.org/P55941 and previous config saved to /var/cache/conftool/dbconfig/20240131-095247-marostegui.json
  • 09:52 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host moss-be2003.codfw.wmnet
  • 09:51 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM testvm2006.codfw.wmnet - ayounsi@cumin1002"
  • 09:51 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM testvm2006.codfw.wmnet - ayounsi@cumin1002"
  • 09:50 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) testvm2006.codfw.wmnet on all recursors
  • 09:50 ayounsi@cumin1002: START - Cookbook sre.dns.wipe-cache testvm2006.codfw.wmnet on all recursors
  • 09:50 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:50 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM testvm2006.codfw.wmnet - ayounsi@cumin1002"
  • 09:49 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM testvm2006.codfw.wmnet - ayounsi@cumin1002"
  • 09:47 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
  • 09:47 ayounsi@cumin1002: START - Cookbook sre.ganeti.makevm for new host testvm2006.codfw.wmnet
  • 09:43 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2148 (T355609)', diff saved to https://phabricator.wikimedia.org/P55940 and previous config saved to /var/cache/conftool/dbconfig/20240131-094301-marostegui.json
  • 09:42 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2148.codfw.wmnet with reason: Maintenance
  • 09:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2148.codfw.wmnet with reason: Maintenance
  • 09:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312 (T355609)', diff saved to https://phabricator.wikimedia.org/P55939 and previous config saved to /var/cache/conftool/dbconfig/20240131-094239-marostegui.json
  • 09:38 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast2003.wikimedia.org
  • 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast5004.wikimedia.org
  • 09:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312', diff saved to https://phabricator.wikimedia.org/P55938 and previous config saved to /var/cache/conftool/dbconfig/20240131-092733-marostegui.json
  • 09:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast5004.wikimedia.org
  • 09:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast4005.wikimedia.org
  • 09:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast4005.wikimedia.org
  • 09:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312', diff saved to https://phabricator.wikimedia.org/P55937 and previous config saved to /var/cache/conftool/dbconfig/20240131-091226-marostegui.json
  • 09:08 filippo@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host titan2002.codfw.wmnet
  • 09:07 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host sretest1003.eqiad.wmnet
  • 09:01 filippo@cumin1002: START - Cookbook sre.hosts.reboot-single for host titan2002.codfw.wmnet
  • 08:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312 (T355609)', diff saved to https://phabricator.wikimedia.org/P55936 and previous config saved to /var/cache/conftool/dbconfig/20240131-085719-marostegui.json
  • 08:55 filippo@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host titan2001.codfw.wmnet
  • 08:52 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest1003.eqiad.wmnet
  • 08:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest1002.eqiad.wmnet
  • 08:47 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2138:3312 (T355609)', diff saved to https://phabricator.wikimedia.org/P55935 and previous config saved to /var/cache/conftool/dbconfig/20240131-084700-marostegui.json
  • 08:46 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2138.codfw.wmnet with reason: Maintenance
  • 08:46 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2138.codfw.wmnet with reason: Maintenance
  • 08:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126 (T355609)', diff saved to https://phabricator.wikimedia.org/P55934 and previous config saved to /var/cache/conftool/dbconfig/20240131-084637-marostegui.json
  • 08:45 filippo@cumin1002: START - Cookbook sre.hosts.reboot-single for host titan2001.codfw.wmnet
  • 08:44 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest1002.eqiad.wmnet
  • 08:44 filippo@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host grafana2001.codfw.wmnet
  • 08:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host crm2001.codfw.wmnet
  • 08:40 filippo@cumin1002: START - Cookbook sre.hosts.reboot-single for host grafana2001.codfw.wmnet
  • 08:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host crm2001.codfw.wmnet
  • 08:31 marostegui@cumin1002: dbctl commit (dc=all): 'db2114 (re)pooling @ 100%: After Bookworm upgrade T354506', diff saved to https://phabricator.wikimedia.org/P55932 and previous config saved to /var/cache/conftool/dbconfig/20240131-083142-root.json
  • 08:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P55931 and previous config saved to /var/cache/conftool/dbconfig/20240131-083130-marostegui.json
  • 08:27 moritzm: installing systemd bugfix updates from bookworm 12.4 point release
  • 08:21 moritzm: installing systemd bugfix updates from bookworm 12.4 point release
  • 08:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testvm2004.codfw.wmnet
  • 08:18 slyngshede@cumin1002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM idm1001.wikimedia.org
  • 08:17 filippo@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 08:16 marostegui@cumin1002: dbctl commit (dc=all): 'db2114 (re)pooling @ 75%: After Bookworm upgrade T354506', diff saved to https://phabricator.wikimedia.org/P55930 and previous config saved to /var/cache/conftool/dbconfig/20240131-081637-root.json
  • 08:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host testvm2004.codfw.wmnet
  • 08:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P55929 and previous config saved to /var/cache/conftool/dbconfig/20240131-081624-marostegui.json
  • 08:14 slyngshede@cumin1002: START - Cookbook sre.ganeti.reboot-vm for VM idm1001.wikimedia.org
  • 08:13 slyngshede@cumin1002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM idm2001.wikimedia.org
  • 08:13 filippo@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 08:09 moritzm: installing ca-certificates-java bugfix updates from bookworm 12.4 point release
  • 08:09 slyngshede@cumin1002: START - Cookbook sre.ganeti.reboot-vm for VM idm2001.wikimedia.org
  • 08:09 slyngshede@cumin1002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM idm-test1001.wikimedia.org
  • 08:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt1002.wikimedia.org
  • 08:05 slyngshede@cumin1002: START - Cookbook sre.ganeti.reboot-vm for VM idm-test1001.wikimedia.org
  • 08:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt1002.wikimedia.org
  • 08:01 marostegui@cumin1002: dbctl commit (dc=all): 'db2114 (re)pooling @ 50%: After Bookworm upgrade T354506', diff saved to https://phabricator.wikimedia.org/P55928 and previous config saved to /var/cache/conftool/dbconfig/20240131-080132-root.json
  • 08:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126 (T355609)', diff saved to https://phabricator.wikimedia.org/P55927 and previous config saved to /var/cache/conftool/dbconfig/20240131-080117-marostegui.json
  • 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-rw2001.wikimedia.org
  • 07:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2126 (T355609)', diff saved to https://phabricator.wikimedia.org/P55926 and previous config saved to /var/cache/conftool/dbconfig/20240131-075600-marostegui.json
  • 07:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 07:55 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 07:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2126.codfw.wmnet with reason: Maintenance
  • 07:55 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2126.codfw.wmnet with reason: Maintenance
  • 07:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125 (T355609)', diff saved to https://phabricator.wikimedia.org/P55925 and previous config saved to /var/cache/conftool/dbconfig/20240131-075522-marostegui.json
  • 07:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-rw2001.wikimedia.org
  • 07:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-rw1001.wikimedia.org
  • 07:46 marostegui@cumin1002: dbctl commit (dc=all): 'db2114 (re)pooling @ 25%: After Bookworm upgrade T354506', diff saved to https://phabricator.wikimedia.org/P55924 and previous config saved to /var/cache/conftool/dbconfig/20240131-074627-root.json
  • 07:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-rw1001.wikimedia.org
  • 07:43 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 07:43 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: codfw routed cluster tap - ayounsi@cumin1002"
  • 07:42 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: codfw routed cluster tap - ayounsi@cumin1002"
  • 07:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125', diff saved to https://phabricator.wikimedia.org/P55923 and previous config saved to /var/cache/conftool/dbconfig/20240131-074015-marostegui.json
  • 07:39 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
  • 07:38 ayounsi@cumin1002: END (ERROR) - Cookbook sre.dns.netbox (exit_code=97)
  • 07:38 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
  • 07:31 marostegui@cumin1002: dbctl commit (dc=all): 'db2114 (re)pooling @ 10%: After Bookworm upgrade T354506', diff saved to https://phabricator.wikimedia.org/P55922 and previous config saved to /var/cache/conftool/dbconfig/20240131-073121-root.json
  • 07:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host debmonitor1003.eqiad.wmnet
  • 07:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125', diff saved to https://phabricator.wikimedia.org/P55921 and previous config saved to /var/cache/conftool/dbconfig/20240131-072509-marostegui.json
  • 07:24 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host debmonitor1003.eqiad.wmnet
  • 07:21 marostegui@cumin1002: dbctl commit (dc=all): 'db1224 (re)pooling @ 100%: After onsite maintenance', diff saved to https://phabricator.wikimedia.org/P55920 and previous config saved to /var/cache/conftool/dbconfig/20240131-072129-root.json
  • 07:16 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2142.codfw.wmnet with OS bookworm
  • 07:16 marostegui@cumin1002: dbctl commit (dc=all): 'db2114 (re)pooling @ 5%: After Bookworm upgrade T354506', diff saved to https://phabricator.wikimedia.org/P55919 and previous config saved to /var/cache/conftool/dbconfig/20240131-071616-root.json
  • 07:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host debmonitor2003.codfw.wmnet
  • 07:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125 (T355609)', diff saved to https://phabricator.wikimedia.org/P55918 and previous config saved to /var/cache/conftool/dbconfig/20240131-071002-marostegui.json
  • 07:08 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host debmonitor2003.codfw.wmnet
  • 07:06 marostegui@cumin1002: dbctl commit (dc=all): 'db1224 (re)pooling @ 75%: After onsite maintenance', diff saved to https://phabricator.wikimedia.org/P55917 and previous config saved to /var/cache/conftool/dbconfig/20240131-070624-root.json
  • 07:01 marostegui@cumin1002: dbctl commit (dc=all): 'db2114 (re)pooling @ 1%: After Bookworm upgrade T354506', diff saved to https://phabricator.wikimedia.org/P55916 and previous config saved to /var/cache/conftool/dbconfig/20240131-070111-root.json
  • 06:59 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2125 (T355609)', diff saved to https://phabricator.wikimedia.org/P55915 and previous config saved to /var/cache/conftool/dbconfig/20240131-065922-marostegui.json
  • 06:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2125.codfw.wmnet with reason: Maintenance
  • 06:59 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2125.codfw.wmnet with reason: Maintenance
  • 06:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2107 (T355609)', diff saved to https://phabricator.wikimedia.org/P55914 and previous config saved to /var/cache/conftool/dbconfig/20240131-065901-marostegui.json
  • 06:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2142.codfw.wmnet with reason: host reimage
  • 06:54 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2114.codfw.wmnet with OS bookworm
  • 06:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2142.codfw.wmnet with reason: host reimage
  • 06:51 marostegui@cumin1002: dbctl commit (dc=all): 'db1224 (re)pooling @ 50%: After onsite maintenance', diff saved to https://phabricator.wikimedia.org/P55913 and previous config saved to /var/cache/conftool/dbconfig/20240131-065118-root.json
  • 06:47 moritzm: installing glibc security updates on bookworm
  • 06:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2107', diff saved to https://phabricator.wikimedia.org/P55912 and previous config saved to /var/cache/conftool/dbconfig/20240131-064353-marostegui.json
  • 06:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2114.codfw.wmnet with reason: host reimage
  • 06:36 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2114.codfw.wmnet with reason: host reimage
  • 06:36 marostegui@cumin1002: dbctl commit (dc=all): 'db1224 (re)pooling @ 25%: After onsite maintenance', diff saved to https://phabricator.wikimedia.org/P55911 and previous config saved to /var/cache/conftool/dbconfig/20240131-063613-root.json
  • 06:35 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2142.codfw.wmnet with OS bookworm
  • 06:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2107', diff saved to https://phabricator.wikimedia.org/P55910 and previous config saved to /var/cache/conftool/dbconfig/20240131-062846-marostegui.json
  • 06:22 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2114.codfw.wmnet with OS bookworm
  • 06:21 marostegui@cumin1002: dbctl commit (dc=all): 'db1224 (re)pooling @ 10%: After onsite maintenance', diff saved to https://phabricator.wikimedia.org/P55909 and previous config saved to /var/cache/conftool/dbconfig/20240131-062109-root.json
  • 06:19 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2114 T354506', diff saved to https://phabricator.wikimedia.org/P55908 and previous config saved to /var/cache/conftool/dbconfig/20240131-061932-root.json
  • 06:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2107 (T355609)', diff saved to https://phabricator.wikimedia.org/P55907 and previous config saved to /var/cache/conftool/dbconfig/20240131-061340-marostegui.json
  • 06:06 marostegui@cumin1002: dbctl commit (dc=all): 'db1224 (re)pooling @ 5%: After onsite maintenance', diff saved to https://phabricator.wikimedia.org/P55906 and previous config saved to /var/cache/conftool/dbconfig/20240131-060602-root.json
  • 06:03 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2107 (T355609)', diff saved to https://phabricator.wikimedia.org/P55905 and previous config saved to /var/cache/conftool/dbconfig/20240131-060337-marostegui.json
  • 06:03 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2107.codfw.wmnet with reason: Maintenance
  • 06:03 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2107.codfw.wmnet with reason: Maintenance
  • 05:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2097.codfw.wmnet with reason: Maintenance
  • 05:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2097.codfw.wmnet with reason: Maintenance
  • 05:50 marostegui@cumin1002: dbctl commit (dc=all): 'db1224 (re)pooling @ 1%: After onsite maintenance', diff saved to https://phabricator.wikimedia.org/P55904 and previous config saved to /var/cache/conftool/dbconfig/20240131-055057-root.json
  • 05:41 eileen: civicrm upgraded from 6de61520 to 520337a0
  • 05:30 fab@deploy2002: Finished deploy [airflow-dags/research@97c6a4e]: (no justification provided) (duration: 00m 14s)
  • 05:30 fab@deploy2002: Started deploy [airflow-dags/research@97c6a4e]: (no justification provided)
  • 03:29 eileen: tools upgraded from 02281338 to c823e692
  • 03:05 fab@deploy2002: Finished deploy [airflow-dags/research@6a97a34]: (no justification provided) (duration: 00m 23s)
  • 03:05 fab@deploy2002: Started deploy [airflow-dags/research@6a97a34]: (no justification provided)

2024-01-30

  • 23:54 mutante: LDAP - added aklapper to group releng T356043
  • 23:07 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for sessionstore1006.eqiad.wmnet
  • 23:07 eevans@cumin1002: START - Cookbook sre.hosts.remove-downtime for sessionstore1006.eqiad.wmnet
  • 22:49 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on sessionstore1006.eqiad.wmnet with reason: Bootstrapping — T353402
  • 22:48 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on sessionstore1006.eqiad.wmnet with reason: Bootstrapping — T353402
  • 22:41 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: activate first private IP host config - bking@cumin2002 - T355617
  • 22:20 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for sessionstore1005.eqiad.wmnet
  • 22:20 eevans@cumin1002: START - Cookbook sre.hosts.remove-downtime for sessionstore1005.eqiad.wmnet
  • 22:10 cjming: end of UTC late backport window
  • 22:09 cjming@deploy2002: Finished scap: Backport for [eswiki] Add 13 namespaces to $wgExemptFromUserRobotsControl (T355033) (duration: 08m 24s)
  • 22:02 cjming@deploy2002: cjming and superpes: Continuing with sync
  • 22:02 cjming@deploy2002: cjming and superpes: Backport for [eswiki] Add 13 namespaces to $wgExemptFromUserRobotsControl (T355033) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 22:00 cjming@deploy2002: Started scap: Backport for [eswiki] Add 13 namespaces to $wgExemptFromUserRobotsControl (T355033)
  • 21:59 cjming@deploy2002: Finished scap: Backport for [ukwiki] Change autoconfirmed setting (T355972), [ganwiki] Add 'suppressredirect' to transwiki usergroup and change assignment and revocation methods (T354850), [ganwiki] Add new namespace aliases (T355854) (duration: 09m 32s)
  • 21:53 cjming@deploy2002: superpes and cjming: Continuing with sync
  • 21:51 cjming@deploy2002: superpes and cjming: Backport for [ukwiki] Change autoconfirmed setting (T355972), [ganwiki] Add 'suppressredirect' to transwiki usergroup and change assignment and revocation methods (T354850), [ganwiki] Add new namespace aliases (T355854) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:50 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on sessionstore1005.eqiad.wmnet with reason: Bootstrapping — T353402
  • 21:50 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on sessionstore1005.eqiad.wmnet with reason: Bootstrapping — T353402
  • 21:49 cjming@deploy2002: Started scap: Backport for [ukwiki] Change autoconfirmed setting (T355972), [ganwiki] Add 'suppressredirect' to transwiki usergroup and change assignment and revocation methods (T354850), [ganwiki] Add new namespace aliases (T355854)
  • 21:44 cjming@deploy2002: Finished scap: Backport for Run CheckerJob against read-only clusters (T354793) (duration: 07m 41s)
  • 21:42 mutante: LDAP - added jnuche to group releng (T356043) - already done/approved in the past in T301149
  • 21:41 mutante: LDAP - added jhuneidi to group releng (T356043) - already done/approved in the past in T210028
  • 21:40 mutante: LDAP - added brennen to group releng (T356043) - already done/approved in the past in T215365
  • 21:38 cjming@deploy2002: cjming and ebernhardson: Continuing with sync
  • 21:38 cjming@deploy2002: cjming and ebernhardson: Backport for Run CheckerJob against read-only clusters (T354793) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:37 cjming@deploy2002: Started scap: Backport for Run CheckerJob against read-only clusters (T354793)
  • 21:36 cjming@deploy2002: Finished scap: Backport for Run CheckerJob against read-only clusters (T354793) (duration: 07m 49s)
  • 21:34 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: activate first private IP host config - bking@cumin2002 - T355617
  • 21:30 cjming@deploy2002: ebernhardson and cjming: Continuing with sync
  • 21:30 cjming@deploy2002: ebernhardson and cjming: Backport for Run CheckerJob against read-only clusters (T354793) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:28 cjming@deploy2002: Started scap: Backport for Run CheckerJob against read-only clusters (T354793)
  • 21:01 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for sessionstore1004.eqiad.wmnet
  • 21:01 eevans@cumin1002: START - Cookbook sre.hosts.remove-downtime for sessionstore1004.eqiad.wmnet
  • 20:52 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: activate first private IP host config - bking@cumin2002 - T355617
  • 20:51 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: activate first private IP host config - bking@cumin2002 - T355617
  • 20:38 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on sessionstore1004.eqiad.wmnet with reason: Commissioning — T353402
  • 20:38 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on sessionstore1004.eqiad.wmnet with reason: Commissioning — T353402
  • 20:35 urandom: bootstrapping sessionstore1004/cassandra-a — T353402
  • 20:01 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: wdqs::public
  • 19:45 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: wdqs::public
  • 19:36 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Unbanning all hosts in cloudelastic
  • 19:36 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Unbanning all hosts in cloudelastic
  • 19:36 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.ban (exit_code=99) Banning hosts: cloudelastic1010.eqiad.wmnet for use cloudelastic1010 as migration canary - bking@cumin2002 - T355617
  • 19:36 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Banning hosts: cloudelastic1010.eqiad.wmnet for use cloudelastic1010 as migration canary - bking@cumin2002 - T355617
  • 19:27 Lucas_WMDE: FINISHED lucaswerkmeister-wmde@mwmaint2002:~$ mwscript CheckSignatures enwiki | tee T356168 # -- 268378 invalid signatures --
  • 19:10 dancy@deploy2002: rebuilt and synchronized wikiversions files: group0 wikis to 1.42.0-wmf.16 refs T354434
  • 19:09 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2005.codfw.wmnet with OS bookworm
  • 18:52 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 18:52 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 18:46 xcollazo@deploy2002: Finished deploy [airflow-dags/analytics@ccaa5dc]: (no justification provided) (duration: 00m 05s)
  • 18:46 xcollazo@deploy2002: Started deploy [airflow-dags/analytics@ccaa5dc]: (no justification provided)
  • 18:17 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host sretest2005.codfw.wmnet with OS bookworm
  • 18:16 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2005.codfw.wmnet with OS bookworm
  • 18:05 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
  • 18:04 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
  • 18:04 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
  • 18:04 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
  • 18:04 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
  • 18:03 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
  • 18:03 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
  • 18:03 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
  • 18:02 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
  • 18:02 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
  • 18:02 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
  • 18:02 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
  • 17:37 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: activate new elastic config - bking@cumin2002 - T355617
  • 17:37 urandom: DROP test_spark3_loading keyspace, Generated Data (Cassandra) cluster — T356112
  • 17:22 jforrester@deploy2002: Finished scap: Backport for Do not search for elements if no previews have been registered (T355933 T356186 T356193), Do not search for elements if no previews have been registered (T355933 T356186 T356193) (duration: 11m 51s)
  • 17:21 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host sretest2005.codfw.wmnet with OS bookworm
  • 17:15 jforrester@deploy2002: jforrester: Continuing with sync
  • 17:14 jforrester@deploy2002: jforrester: Backport for Do not search for elements if no previews have been registered (T355933 T356186 T356193), Do not search for elements if no previews have been registered (T355933 T356186 T356193) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 17:13 ayounsi@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2005.codfw.wmnet with OS bookworm
  • 17:10 jforrester@deploy2002: Started scap: Backport for Do not search for elements if no previews have been registered (T355933 T356186 T356193), Do not search for elements if no previews have been registered (T355933 T356186 T356193)
  • 16:57 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: activate new elastic config - bking@cumin2002 - T355617
  • 16:56 bking@cumin2002: conftool action : set/weight=10; selector: name=cloudelastic1009.wikimedia.org
  • 16:56 bking@cumin2002: conftool action : set/weight=10; selector: name=cloudelastic1008.wikimedia.org
  • 16:56 bking@cumin2002: conftool action : set/weight=10; selector: name=cloudelastic1007.wikimedia.org
  • 16:54 claime: Running homer 'cr*codfw*' commit 'T351074'
  • 16:54 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: sync
  • 16:54 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: sync
  • 16:49 mutante: gitlab is back
  • 16:48 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
  • 16:47 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
  • 16:47 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
  • 16:47 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
  • 16:44 mutante: gitlab is down for maintenance for a few minutes
  • 16:34 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: activate new elastic config - bking@cumin2002 - T355617
  • 16:29 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on gitlab.wikimedia.org with reason: server move
  • 16:29 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on gitlab.wikimedia.org with reason: server move
  • 16:28 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on gitlab2002.wikimedia.org with reason: server move
  • 16:28 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on gitlab2002.wikimedia.org with reason: server move
  • 16:25 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1466.eqiad.wmnet with OS bullseye
  • 16:21 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1457.eqiad.wmnet with OS bullseye
  • 16:18 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2366.codfw.wmnet with OS bullseye
  • 16:14 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1440.eqiad.wmnet with OS bullseye
  • 16:14 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: activate new elastic config - bking@cumin2002 - T355617
  • 16:13 bking@cumin2002: conftool action : set/pooled=yes; selector: name=cloudelastic1008.wikimedia.org
  • 16:13 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2370.codfw.wmnet with OS bullseye
  • 16:11 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host sretest2005.codfw.wmnet with OS bookworm
  • 16:09 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1482.eqiad.wmnet with OS bullseye
  • 16:08 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2368.codfw.wmnet with OS bullseye
  • 16:06 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1466.eqiad.wmnet with reason: host reimage
  • 16:03 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1459.eqiad.wmnet with OS bullseye
  • 16:02 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1457.eqiad.wmnet with reason: host reimage
  • 15:59 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2366.codfw.wmnet with reason: host reimage
  • 15:58 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cloudelastic1010.eqiad.wmnet with reason: T355617
  • 15:58 bking@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on cloudelastic1010.eqiad.wmnet with reason: T355617
  • 15:56 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1440.eqiad.wmnet with reason: host reimage
  • 15:54 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: activate new elastic config - bking@cumin2002 - T355617
  • 15:53 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2370.codfw.wmnet with reason: host reimage
  • 15:50 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1482.eqiad.wmnet with reason: host reimage
  • 15:47 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2368.codfw.wmnet with reason: host reimage
  • 15:44 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1459.eqiad.wmnet with reason: host reimage
  • 15:42 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2370.codfw.wmnet with reason: host reimage
  • 15:42 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1457.eqiad.wmnet with reason: host reimage
  • 15:42 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1466.eqiad.wmnet with reason: host reimage
  • 15:42 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2366.codfw.wmnet with reason: host reimage
  • 15:42 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1440.eqiad.wmnet with reason: host reimage
  • 15:41 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2368.codfw.wmnet with reason: host reimage
  • 15:41 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1482.eqiad.wmnet with reason: host reimage
  • 15:41 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1459.eqiad.wmnet with reason: host reimage
  • 15:40 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: activate new elastic config - bking@cumin2002 - T355617
  • 15:29 Lucas_WMDE: START lucaswerkmeister-wmde@mwmaint2002:~$ mwscript CheckSignatures enwiki | tee T356168
  • 15:28 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1466.eqiad.wmnet with OS bullseye
  • 15:28 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1459.eqiad.wmnet with OS bullseye
  • 15:28 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1482.eqiad.wmnet with OS bullseye
  • 15:28 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1457.eqiad.wmnet with OS bullseye
  • 15:27 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1440.eqiad.wmnet with OS bullseye
  • 15:26 Lucas_WMDE: UTC afternoon backport+config window done
  • 15:26 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw2370.codfw.wmnet with OS bullseye
  • 15:25 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw2368.codfw.wmnet with OS bullseye
  • 15:25 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw2366.codfw.wmnet with OS bullseye
  • 15:17 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint2002:~$ mwscript namespaceDupes enwikiquote --fix # T355195 (two pages will need separate fixing)
  • 15:17 claime: Recommissioning mw2366.codfw.wmnet,mw2368.codfw.wmnet,mw2370.codfw.wmnet as k8s nodes - T351074
  • 15:17 ayounsi@cumin1002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host sretest2005.codfw.wmnet
  • 15:17 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2005.codfw.wmnet with OS bookworm
  • 15:16 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for [enwikiquote] Add a draft namespace and its talk space (T355195) (duration: 08m 43s)
  • 15:09 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and superpes: Continuing with sync
  • 15:08 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and superpes: Backport for [enwikiquote] Add a draft namespace and its talk space (T355195) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 15:07 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for [enwikiquote] Add a draft namespace and its talk space (T355195)
  • 15:06 claime: Manual run of mediawiki_job_generatecaptcha.service following timer failure - T141490
  • 15:06 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint2002:~$ mwscript namespaceDupes enwiktionary --fix # T354813
  • 15:05 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for [enwiktionary] Remove the Concordance namespace and its talk space (T354813) (duration: 09m 57s)
  • 14:59 logmsgbot: lucaswerkmeister-wmde@deploy2002 superpes and lucaswerkmeister-wmde: Continuing with sync
  • 14:57 logmsgbot: lucaswerkmeister-wmde@deploy2002 superpes and lucaswerkmeister-wmde: Backport for [enwiktionary] Remove the Concordance namespace and its talk space (T354813) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:55 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for [enwiktionary] Remove the Concordance namespace and its talk space (T354813)
  • 14:52 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint2002:~$ mwscript namespaceDupes azwiki --fix # T355041, failed at the end :(
  • 14:52 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for [azwiki] Changing 9 namespace aliases (T355041) (duration: 08m 37s)
  • 14:46 logmsgbot: lucaswerkmeister-wmde@deploy2002 superpes and lucaswerkmeister-wmde: Continuing with sync
  • 14:45 logmsgbot: lucaswerkmeister-wmde@deploy2002 superpes and lucaswerkmeister-wmde: Backport for [azwiki] Changing 9 namespace aliases (T355041) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:43 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for [azwiki] Changing 9 namespace aliases (T355041)
  • 14:41 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for CommentParser: Ignore generated timestamp links (T356142), CommentParser: Ignore generated timestamp links (T356142), Add maintenance script to list users with invalid signatures (T356168) (duration: 11m 01s)
  • 14:40 filippo@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 14:35 logmsgbot: lucaswerkmeister-wmde@deploy2002 matmarex and lucaswerkmeister-wmde: Continuing with sync
  • 14:32 gmodena@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
  • 14:32 logmsgbot: lucaswerkmeister-wmde@deploy2002 matmarex and lucaswerkmeister-wmde: Backport for CommentParser: Ignore generated timestamp links (T356142), CommentParser: Ignore generated timestamp links (T356142), Add maintenance script to list users with invalid signatures (T356168) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:31 gmodena@deploy2002: helmfile [codfw] START helmfile.d/services/eventstreams: apply
  • 14:31 gmodena@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
  • 14:30 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for CommentParser: Ignore generated timestamp links (T356142), CommentParser: Ignore generated timestamp links (T356142), Add maintenance script to list users with invalid signatures (T356168)
  • 14:30 gmodena@deploy2002: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
  • 14:30 filippo@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 14:26 gmodena@deploy2002: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
  • 14:26 gmodena@deploy2002: helmfile [staging] START helmfile.d/services/eventstreams: apply
  • 14:20 logmsgbot: lucaswerkmeister-wmde@deploy2002 backport Cancelled
  • 14:18 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for Don't bail out early when there are no selectors configured (T355933) (duration: 09m 04s)
  • 14:12 logmsgbot: lucaswerkmeister-wmde@deploy2002 wmde-fisch and lucaswerkmeister-wmde: Continuing with sync
  • 14:11 logmsgbot: lucaswerkmeister-wmde@deploy2002 wmde-fisch and lucaswerkmeister-wmde: Backport for Don't bail out early when there are no selectors configured (T355933) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:11 volans@cumin2002: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
  • 14:09 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for Don't bail out early when there are no selectors configured (T355933)
  • 14:09 volans@cumin2002: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
  • 13:56 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host sretest2005.codfw.wmnet with OS bookworm
  • 13:55 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM sretest2005.codfw.wmnet - ayounsi@cumin1002"
  • 13:55 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM sretest2005.codfw.wmnet - ayounsi@cumin1002"
  • 13:54 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) sretest2005.codfw.wmnet on all recursors
  • 13:54 ayounsi@cumin1002: START - Cookbook sre.dns.wipe-cache sretest2005.codfw.wmnet on all recursors
  • 13:54 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:54 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM sretest2005.codfw.wmnet - ayounsi@cumin1002"
  • 13:53 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM sretest2005.codfw.wmnet - ayounsi@cumin1002"
  • 13:47 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
  • 13:47 ayounsi@cumin1002: START - Cookbook sre.ganeti.makevm for new host sretest2005.codfw.wmnet
  • 13:45 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts srestest2005.codfw.wmnet
  • 13:45 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:45 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: srestest2005.codfw.wmnet decommissioned, removing all IPs except the asset tag one - ayounsi@cumin1002"
  • 13:44 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: srestest2005.codfw.wmnet decommissioned, removing all IPs except the asset tag one - ayounsi@cumin1002"
  • 13:39 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
  • 13:37 stevemunene@cumin1002: START - Cookbook sre.hadoop.init-hadoop-workers for hosts an-worker[1157-1175].eqiad.wmnet
  • 13:36 ayounsi@cumin1002: START - Cookbook sre.hosts.decommission for hosts srestest2005.codfw.wmnet
  • 13:34 ayounsi@cumin1002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=94) for new host srestest2005.codfw.wmnet
  • 13:33 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM srestest2005.codfw.wmnet - ayounsi@cumin1002"
  • 13:33 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM srestest2005.codfw.wmnet - ayounsi@cumin1002"
  • 13:32 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) srestest2005.codfw.wmnet on all recursors
  • 13:32 ayounsi@cumin1002: START - Cookbook sre.dns.wipe-cache srestest2005.codfw.wmnet on all recursors
  • 13:32 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:32 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM srestest2005.codfw.wmnet - ayounsi@cumin1002"
  • 13:31 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM srestest2005.codfw.wmnet - ayounsi@cumin1002"
  • 13:26 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
  • 13:26 ayounsi@cumin1002: START - Cookbook sre.ganeti.makevm for new host srestest2005.codfw.wmnet
  • 13:16 ayounsi@cumin1002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=93) for new host srestest2005.codfw.wmnet
  • 13:16 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) srestest2005.codfw.wmnet on all recursors
  • 13:16 ayounsi@cumin1002: START - Cookbook sre.dns.wipe-cache srestest2005.codfw.wmnet on all recursors
  • 13:16 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:16 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM srestest2005.codfw.wmnet - ayounsi@cumin1002"
  • 13:15 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM srestest2005.codfw.wmnet - ayounsi@cumin1002"
  • 13:12 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
  • 13:12 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) srestest2005.codfw.wmnet on all recursors
  • 13:12 ayounsi@cumin1002: START - Cookbook sre.dns.wipe-cache srestest2005.codfw.wmnet on all recursors
  • 13:12 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:12 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM srestest2005.codfw.wmnet - ayounsi@cumin1002"
  • 13:10 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM srestest2005.codfw.wmnet - ayounsi@cumin1002"
  • 13:08 stevemunene@cumin1002: END (FAIL) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=99) for hosts an-worker[1159-1175].eqiad.wmnet
  • 13:08 stevemunene@cumin1002: START - Cookbook sre.hadoop.init-hadoop-workers for hosts an-worker[1159-1175].eqiad.wmnet
  • 13:08 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
  • 13:08 ayounsi@cumin1002: START - Cookbook sre.ganeti.makevm for new host srestest2005.codfw.wmnet
  • 13:06 stevemunene@cumin1002: END (FAIL) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=99) for hosts an-worker1158.eqiad.wmnet
  • 13:04 stevemunene@cumin1002: START - Cookbook sre.hadoop.init-hadoop-workers for hosts an-worker1158.eqiad.wmnet
  • 12:19 taavi: reprepro import exim4 4.96-15+deb12u4+wmf1 to component/exim4-arc T356171
  • 11:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2114 (T343718)', diff saved to https://phabricator.wikimedia.org/P55896 and previous config saved to /var/cache/conftool/dbconfig/20240130-114726-ladsgroup.json
  • 11:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-airflow1005.eqiad.wmnet
  • 11:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2114', diff saved to https://phabricator.wikimedia.org/P55895 and previous config saved to /var/cache/conftool/dbconfig/20240130-113220-ladsgroup.json
  • 11:30 stevemunene@cumin1002: END (FAIL) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=99) for hosts an-worker1157.eqiad.wmnet
  • 11:28 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host an-airflow1005.eqiad.wmnet
  • 11:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2113.codfw.wmnet with reason: Maintenance
  • 11:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2113.codfw.wmnet with reason: Maintenance
  • 11:19 stevemunene@cumin1002: START - Cookbook sre.hadoop.init-hadoop-workers for hosts an-worker1157.eqiad.wmnet
  • 11:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2114', diff saved to https://phabricator.wikimedia.org/P55894 and previous config saved to /var/cache/conftool/dbconfig/20240130-111713-ladsgroup.json
  • 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: analytics_cluster::airflow::search
  • 11:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1183.eqiad.wmnet with reason: Maintenance
  • 11:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1183.eqiad.wmnet with reason: Maintenance
  • 11:02 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: analytics_cluster::airflow::search
  • 11:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2114 (T343718)', diff saved to https://phabricator.wikimedia.org/P55893 and previous config saved to /var/cache/conftool/dbconfig/20240130-110207-ladsgroup.json
  • 10:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2114 (T343718)', diff saved to https://phabricator.wikimedia.org/P55892 and previous config saved to /var/cache/conftool/dbconfig/20240130-105954-ladsgroup.json
  • 10:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2114.codfw.wmnet with reason: Maintenance
  • 10:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2114.codfw.wmnet with reason: Maintenance
  • 10:56 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-airflow1005.eqiad.wmnet with OS bullseye
  • 10:56 filippo@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 10:45 filippo@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 10:35 ayounsi@cumin1002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=93) for new host srestest2005.codfw.wmnet
  • 10:35 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) srestest2005.codfw.wmnet on all recursors
  • 10:35 ayounsi@cumin1002: START - Cookbook sre.dns.wipe-cache srestest2005.codfw.wmnet on all recursors
  • 10:35 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:35 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM srestest2005.codfw.wmnet - ayounsi@cumin1002"
  • 10:34 filippo@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 10:34 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM srestest2005.codfw.wmnet - ayounsi@cumin1002"
  • 10:32 volans@cumin1002: END (FAIL) - Cookbook sre.netbox.update-extras (exit_code=1) rolling restart_daemons on A:netbox-canary
  • 10:32 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-airflow1005.eqiad.wmnet with reason: host reimage
  • 10:31 volans@cumin1002: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
  • 10:31 volans@cumin1002: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
  • 10:31 volans@cumin1002: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
  • 10:29 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
  • 10:29 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) srestest2005.codfw.wmnet on all recursors
  • 10:29 ayounsi@cumin1002: START - Cookbook sre.dns.wipe-cache srestest2005.codfw.wmnet on all recursors
  • 10:29 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:28 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM srestest2005.codfw.wmnet - ayounsi@cumin1002"
  • 10:28 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM srestest2005.codfw.wmnet - ayounsi@cumin1002"
  • 10:26 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on an-airflow1005.eqiad.wmnet with reason: host reimage
  • 10:26 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
  • 10:25 ayounsi@cumin1002: START - Cookbook sre.ganeti.makevm for new host srestest2005.codfw.wmnet
  • 10:24 ayounsi@cumin1002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host srestest2005.codfw.wmnet
  • 10:24 ayounsi@cumin1002: END (ERROR) - Cookbook sre.dns.netbox (exit_code=97)
  • 10:23 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
  • 10:23 ayounsi@cumin1002: START - Cookbook sre.ganeti.makevm for new host srestest2005.codfw.wmnet
  • 10:23 filippo@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 10:16 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host an-airflow1005.eqiad.wmnet with OS bullseye
  • 10:06 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host phab1004.eqiad.wmnet
  • 10:00 gmodena@deploy2002: Finished deploy [airflow-dags/analytics@ccaa5dc]: (no justification provided) (duration: 00m 37s)
  • 10:00 gmodena@deploy2002: Started deploy [airflow-dags/analytics@ccaa5dc]: (no justification provided)
  • 09:56 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host phab1004.eqiad.wmnet
  • 09:30 brouberol@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-tool1008.eqiad.wmnet with OS bullseye
  • 09:14 brouberol@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-tool1008.eqiad.wmnet with reason: host reimage
  • 09:11 brouberol@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on an-tool1008.eqiad.wmnet with reason: host reimage
  • 09:07 marostegui@cumin1002: dbctl commit (dc=all): 'db2114 (re)pooling @ 100%: Switchover', diff saved to https://phabricator.wikimedia.org/P55891 and previous config saved to /var/cache/conftool/dbconfig/20240130-090704-root.json
  • 09:00 brouberol@cumin1002: START - Cookbook sre.hosts.reimage for host an-tool1008.eqiad.wmnet with OS bullseye
  • 08:57 Emperor: restart swift-object-replicator on ms-be1068
  • 08:52 marostegui@cumin1002: dbctl commit (dc=all): 'db2114 (re)pooling @ 75%: Switchover', diff saved to https://phabricator.wikimedia.org/P55890 and previous config saved to /var/cache/conftool/dbconfig/20240130-085159-root.json
  • 08:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2105 (re)pooling @ 100%: After switchover', diff saved to https://phabricator.wikimedia.org/P55889 and previous config saved to /var/cache/conftool/dbconfig/20240130-085055-root.json
  • 08:38 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 100%: After switchover', diff saved to https://phabricator.wikimedia.org/P55888 and previous config saved to /var/cache/conftool/dbconfig/20240130-083829-root.json
  • 08:36 marostegui@cumin1002: dbctl commit (dc=all): 'db2114 (re)pooling @ 50%: Switchover', diff saved to https://phabricator.wikimedia.org/P55887 and previous config saved to /var/cache/conftool/dbconfig/20240130-083654-root.json
  • 08:35 marostegui@cumin1002: dbctl commit (dc=all): 'db2105 (re)pooling @ 75%: After switchover', diff saved to https://phabricator.wikimedia.org/P55886 and previous config saved to /var/cache/conftool/dbconfig/20240130-083550-root.json
  • 08:29 moritzm: upgrading python-pymysql on remaining DB hosts to 1.0.2-2~wmf11u1 T355531
  • 08:28 ladsgroup@deploy2002: Finished scap: Backport for Enable PageNotice extension on testwiki (T61245) (duration: 10m 24s)
  • 08:23 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 75%: After switchover', diff saved to https://phabricator.wikimedia.org/P55885 and previous config saved to /var/cache/conftool/dbconfig/20240130-082324-root.json
  • 08:22 ladsgroup@deploy2002: ladsgroup and tto: Continuing with sync
  • 08:21 marostegui@cumin1002: dbctl commit (dc=all): 'db2114 (re)pooling @ 25%: Switchover', diff saved to https://phabricator.wikimedia.org/P55884 and previous config saved to /var/cache/conftool/dbconfig/20240130-082149-root.json
  • 08:20 marostegui@cumin1002: dbctl commit (dc=all): 'db2105 (re)pooling @ 50%: After switchover', diff saved to https://phabricator.wikimedia.org/P55883 and previous config saved to /var/cache/conftool/dbconfig/20240130-082045-root.json
  • 08:19 ladsgroup@deploy2002: ladsgroup and tto: Backport for Enable PageNotice extension on testwiki (T61245) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:18 ladsgroup@deploy2002: Started scap: Backport for Enable PageNotice extension on testwiki (T61245)
  • 08:08 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 50%: After switchover', diff saved to https://phabricator.wikimedia.org/P55882 and previous config saved to /var/cache/conftool/dbconfig/20240130-080819-root.json
  • 08:06 marostegui@cumin1002: dbctl commit (dc=all): 'db2114 (re)pooling @ 10%: Switchover', diff saved to https://phabricator.wikimedia.org/P55881 and previous config saved to /var/cache/conftool/dbconfig/20240130-080644-root.json
  • 08:05 marostegui@cumin1002: dbctl commit (dc=all): 'db2105 (re)pooling @ 25%: After switchover', diff saved to https://phabricator.wikimedia.org/P55880 and previous config saved to /var/cache/conftool/dbconfig/20240130-080540-root.json
  • 07:55 ayounsi@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti2034.codfw.wmnet to cluster codfw02 and group AB
  • 07:53 ayounsi@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2034.codfw.wmnet to cluster codfw02 and group AB
  • 07:53 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 25%: After switchover', diff saved to https://phabricator.wikimedia.org/P55879 and previous config saved to /var/cache/conftool/dbconfig/20240130-075314-root.json
  • 07:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2105 (re)pooling @ 10%: After switchover', diff saved to https://phabricator.wikimedia.org/P55878 and previous config saved to /var/cache/conftool/dbconfig/20240130-075035-root.json
  • 07:47 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2105 T356069', diff saved to https://phabricator.wikimedia.org/P55877 and previous config saved to /var/cache/conftool/dbconfig/20240130-074746-root.json
  • 07:46 marostegui@cumin1002: dbctl commit (dc=all): 'Promote db2127 to s3 primary and set section read-write T356069', diff saved to https://phabricator.wikimedia.org/P55876 and previous config saved to /var/cache/conftool/dbconfig/20240130-074656-marostegui.json
  • 07:46 marostegui@cumin1002: dbctl commit (dc=all): 'Set s3 codfw as read-only for maintenance - T356069', diff saved to https://phabricator.wikimedia.org/P55875 and previous config saved to /var/cache/conftool/dbconfig/20240130-074634-marostegui.json
  • 07:46 marostegui: Starting s3 codfw failover from db2105 to db2127 - T356069
  • 07:38 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 10%: After switchover', diff saved to https://phabricator.wikimedia.org/P55874 and previous config saved to /var/cache/conftool/dbconfig/20240130-073807-root.json
  • 07:33 root@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 23 hosts with reason: Primary switchover s3 T356069
  • 07:32 marostegui@cumin1002: dbctl commit (dc=all): 'Set db2127 with weight 0 T356069', diff saved to https://phabricator.wikimedia.org/P55873 and previous config saved to /var/cache/conftool/dbconfig/20240130-073257-marostegui.json
  • 07:32 root@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 23 hosts with reason: Primary switchover s3 T356069
  • 07:27 marostegui@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 100%: After switchover', diff saved to https://phabricator.wikimedia.org/P55872 and previous config saved to /var/cache/conftool/dbconfig/20240130-072734-root.json
  • 07:23 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 5%: After switchover', diff saved to https://phabricator.wikimedia.org/P55871 and previous config saved to /var/cache/conftool/dbconfig/20240130-072302-root.json
  • 07:16 marostegui@cumin1002: dbctl commit (dc=all): 'db2103 (re)pooling @ 100%: After switchover', diff saved to https://phabricator.wikimedia.org/P55870 and previous config saved to /var/cache/conftool/dbconfig/20240130-071612-root.json
  • 07:12 marostegui@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 75%: After switchover', diff saved to https://phabricator.wikimedia.org/P55869 and previous config saved to /var/cache/conftool/dbconfig/20240130-071229-root.json
  • 07:12 marostegui@cumin1002: dbctl commit (dc=all): 'Promote db2144 to x2 master T356060', diff saved to https://phabricator.wikimedia.org/P55868 and previous config saved to /var/cache/conftool/dbconfig/20240130-071202-root.json
  • 07:07 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 1%: After switchover', diff saved to https://phabricator.wikimedia.org/P55867 and previous config saved to /var/cache/conftool/dbconfig/20240130-070757-root.json
  • 07:02 marostegui@deploy2002: Finished scap: Backport for Revert "db-production.php: Disable writes on es4" (duration: 07m 48s)
  • 07:01 marostegui@cumin1002: dbctl commit (dc=all): 'db2103 (re)pooling @ 75%: After switchover', diff saved to https://phabricator.wikimedia.org/P55866 and previous config saved to /var/cache/conftool/dbconfig/20240130-070107-root.json
  • 07:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: Primary switchover x2 T356060
  • 07:00 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 6 hosts with reason: Primary switchover x2 T356060
  • 06:57 marostegui@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 50%: After switchover', diff saved to https://phabricator.wikimedia.org/P55865 and previous config saved to /var/cache/conftool/dbconfig/20240130-065724-root.json
  • 06:55 marostegui@deploy2002: marostegui: Continuing with sync
  • 06:55 marostegui@deploy2002: marostegui: Backport for Revert "db-production.php: Disable writes on es4" synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 06:54 marostegui@deploy2002: Started scap: Backport for Revert "db-production.php: Disable writes on es4"
  • 06:48 marostegui@deploy2002: backport Cancelled
  • 06:46 marostegui@cumin1002: dbctl commit (dc=all): 'db2103 (re)pooling @ 50%: After switchover', diff saved to https://phabricator.wikimedia.org/P55864 and previous config saved to /var/cache/conftool/dbconfig/20240130-064602-root.json
  • 06:45 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es2020 T356064', diff saved to https://phabricator.wikimedia.org/P55863 and previous config saved to /var/cache/conftool/dbconfig/20240130-064526-root.json
  • 06:45 marostegui@cumin1002: dbctl commit (dc=all): 'Reduce es2021 weight T356064', diff saved to https://phabricator.wikimedia.org/P55862 and previous config saved to /var/cache/conftool/dbconfig/20240130-064512-root.json
  • 06:42 marostegui@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 25%: After switchover', diff saved to https://phabricator.wikimedia.org/P55861 and previous config saved to /var/cache/conftool/dbconfig/20240130-064219-root.json
  • 06:36 marostegui@cumin1002: dbctl commit (dc=all): 'Promote es2021 to es4 primary T356064', diff saved to https://phabricator.wikimedia.org/P55860 and previous config saved to /var/cache/conftool/dbconfig/20240130-063625-root.json
  • 06:35 marostegui: Starting es4 codfw failover from es2020 to es2021 - T356064
  • 06:30 marostegui@cumin1002: dbctl commit (dc=all): 'db2103 (re)pooling @ 25%: After switchover', diff saved to https://phabricator.wikimedia.org/P55859 and previous config saved to /var/cache/conftool/dbconfig/20240130-063057-root.json
  • 06:30 marostegui@deploy2002: Finished scap: Backport for db-production.php: Disable writes on es4 (T356064) (duration: 09m 11s)
  • 06:29 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1224 T354591', diff saved to https://phabricator.wikimedia.org/P55858 and previous config saved to /var/cache/conftool/dbconfig/20240130-062930-root.json
  • 06:27 marostegui@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 10%: After switchover', diff saved to https://phabricator.wikimedia.org/P55857 and previous config saved to /var/cache/conftool/dbconfig/20240130-062714-root.json
  • 06:23 marostegui@deploy2002: marostegui: Continuing with sync
  • 06:22 marostegui@deploy2002: marostegui: Backport for db-production.php: Disable writes on es4 (T356064) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 06:22 marostegui@cumin1002: dbctl commit (dc=all): 'Set es2020 with weight 0 T356064', diff saved to https://phabricator.wikimedia.org/P55856 and previous config saved to /var/cache/conftool/dbconfig/20240130-062241-marostegui.json
  • 06:22 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: Primary switchover es4 T356064
  • 06:22 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 6 hosts with reason: Primary switchover es4 T356064
  • 06:21 marostegui@deploy2002: Started scap: Backport for db-production.php: Disable writes on es4 (T356064)
  • 06:18 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: Primary switchover es4 T356064
  • 06:18 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 6 hosts with reason: Primary switchover es4 T356064
  • 06:15 marostegui@cumin1002: dbctl commit (dc=all): 'db2103 (re)pooling @ 10%: After switchover', diff saved to https://phabricator.wikimedia.org/P55855 and previous config saved to /var/cache/conftool/dbconfig/20240130-061552-root.json
  • 06:15 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2103 T356059', diff saved to https://phabricator.wikimedia.org/P55854 and previous config saved to /var/cache/conftool/dbconfig/20240130-061529-root.json
  • 06:14 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2146', diff saved to https://phabricator.wikimedia.org/P55853 and previous config saved to /var/cache/conftool/dbconfig/20240130-061423-root.json
  • 06:13 marostegui@cumin1002: dbctl commit (dc=all): 'Promote db2112 to s1 primary and set section read-write T356059', diff saved to https://phabricator.wikimedia.org/P55852 and previous config saved to /var/cache/conftool/dbconfig/20240130-061305-marostegui.json
  • 06:12 marostegui@cumin1002: dbctl commit (dc=all): 'Set s1 codfw as read-only for maintenance - T356059', diff saved to https://phabricator.wikimedia.org/P55851 and previous config saved to /var/cache/conftool/dbconfig/20240130-061243-marostegui.json
  • 06:12 marostegui: Starting s1 codfw failover from db2103 to db2112 - T356059
  • 06:10 marostegui@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 10%: After switchover', diff saved to https://phabricator.wikimedia.org/P55850 and previous config saved to /var/cache/conftool/dbconfig/20240130-061014-root.json
  • 06:07 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2146', diff saved to https://phabricator.wikimedia.org/P55849 and previous config saved to /var/cache/conftool/dbconfig/20240130-060727-root.json
  • 05:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 36 hosts with reason: Primary switchover s1 T356059
  • 05:44 marostegui@cumin1002: dbctl commit (dc=all): 'Set db2112 with weight 0 T356059', diff saved to https://phabricator.wikimedia.org/P55848 and previous config saved to /var/cache/conftool/dbconfig/20240130-054410-marostegui.json
  • 05:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 36 hosts with reason: Primary switchover s1 T356059
  • 05:41 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2114 T355739', diff saved to https://phabricator.wikimedia.org/P55847 and previous config saved to /var/cache/conftool/dbconfig/20240130-054154-root.json
  • 05:40 marostegui@cumin1002: dbctl commit (dc=all): 'Set s6 codfw as read-only for maintenance - T355739', diff saved to https://phabricator.wikimedia.org/P55845 and previous config saved to /var/cache/conftool/dbconfig/20240130-054025-root.json
  • 05:40 marostegui: Starting s6 codfw failover from db2114 to db2129 - T355739
  • 05:19 marostegui@cumin1002: dbctl commit (dc=all): 'Set db2129 with weight 0 T355739', diff saved to https://phabricator.wikimedia.org/P55844 and previous config saved to /var/cache/conftool/dbconfig/20240130-051952-marostegui.json
  • 05:19 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 28 hosts with reason: Primary switchover s6 T355739
  • 05:19 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 28 hosts with reason: Primary switchover s6 T355739
  • 04:57 mwpresync@deploy2002: Finished scap: testwikis wikis to 1.42.0-wmf.16 refs T354434 (duration: 52m 38s)
  • 04:04 mwpresync@deploy2002: Started scap: testwikis wikis to 1.42.0-wmf.16 refs T354434
  • 04:02 mwpresync@deploy2002: Pruned MediaWiki: 1.42.0-wmf.13 (duration: 02m 09s)
  • 03:30 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 03:29 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 00:00 eileen: tools upgraded from 117e1f9c to 544301bd
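
The cookbook entries above follow a regular "END (STATUS) - Cookbook NAME (exit_code=N)" shape, so a day's outcomes can be tallied mechanically. The following is a minimal sketch, not an official tool: the regular expression is inferred only from the entries shown in this log, and the names END_RE, summarize_cookbook_results and SAMPLE are hypothetical helpers introduced for illustration.

```python
import re
from collections import Counter

# Assumed entry shape, inferred from the log lines on this page:
#   "• HH:MM user@host: END (STATUS) - Cookbook NAME (exit_code=N) ..."
END_RE = re.compile(
    r"^\s*•?\s*(?P<time>\d{2}:\d{2}) (?P<who>\S+): "
    r"END \((?P<status>PASS|FAIL|ERROR)\) - Cookbook (?P<cookbook>\S+) "
    r"\(exit_code=(?P<code>\d+)\)"
)


def summarize_cookbook_results(lines):
    """Count END entries per (cookbook, status); other entries are ignored."""
    counts = Counter()
    for line in lines:
        match = END_RE.match(line)
        if match:
            counts[(match["cookbook"], match["status"])] += 1
    return counts


# Sample entries copied verbatim from the 2024-01-30 log above.
SAMPLE = [
    "  • 13:16 ayounsi@cumin1002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=93) for new host srestest2005.codfw.wmnet",
    "  • 13:16 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)",
    "  • 10:24 ayounsi@cumin1002: END (ERROR) - Cookbook sre.dns.netbox (exit_code=97)",
    "  • 10:23 ayounsi@cumin1002: START - Cookbook sre.dns.netbox",
]

if __name__ == "__main__":
    for (cookbook, status), n in sorted(summarize_cookbook_results(SAMPLE).items()):
        print(f"{cookbook}: {status} x{n}")
```

START entries and non-cookbook entries (dbctl commits, scap, helmfile) do not match the pattern and are skipped, so the tally reflects finished cookbook runs only.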

2024-01-29

  • 22:31 catrope@deploy2002: Finished scap: Backport for Drop English Wikipedia configuration for wgMFUseDesktopSpecialHistoryPage (T353388) (duration: 28m 33s)
  • 22:24 catrope@deploy2002: catrope and jdlrobson: Continuing with sync
  • 22:03 catrope@deploy2002: catrope and jdlrobson: Backport for Drop English Wikipedia configuration for wgMFUseDesktopSpecialHistoryPage (T353388) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 22:02 catrope@deploy2002: Started scap: Backport for Drop English Wikipedia configuration for wgMFUseDesktopSpecialHistoryPage (T353388)
  • 21:54 catrope@deploy2002: Finished scap: Backport for Use desktop history page HTML everywhere (T353388), Begin capturing errors for Wikivoyage (duration: 12m 05s)
  • 21:48 catrope@deploy2002: catrope and jdlrobson: Continuing with sync
  • 21:43 catrope@deploy2002: catrope and jdlrobson: Backport for Use desktop history page HTML everywhere (T353388), Begin capturing errors for Wikivoyage synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:42 catrope@deploy2002: Started scap: Backport for Use desktop history page HTML everywhere (T353388), Begin capturing errors for Wikivoyage
  • 21:36 catrope@deploy2002: Finished scap: Backport for DiscussionTools: Enable permalinks frontend everywhere except en.wiki (T356063) (duration: 12m 19s)
  • 21:30 catrope@deploy2002: catrope and esanders: Continuing with sync
  • 21:25 catrope@deploy2002: catrope and esanders: Backport for DiscussionTools: Enable permalinks frontend everywhere except en.wiki (T356063) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:24 catrope@deploy2002: Started scap: Backport for DiscussionTools: Enable permalinks frontend everywhere except en.wiki (T356063)
  • 21:17 catrope@deploy2002: Finished scap: Backport for cirrus: Disable cloudelastic writes to testwiki and mw.org (T352335) (duration: 08m 40s)
  • 21:11 catrope@deploy2002: ebernhardson and catrope: Continuing with sync
  • 21:10 catrope@deploy2002: ebernhardson and catrope: Backport for cirrus: Disable cloudelastic writes to testwiki and mw.org (T352335) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:09 catrope@deploy2002: Started scap: Backport for cirrus: Disable cloudelastic writes to testwiki and mw.org (T352335)
  • 20:37 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:37 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:33 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:33 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193 (T355609)', diff saved to https://phabricator.wikimedia.org/P55843 and previous config saved to /var/cache/conftool/dbconfig/20240129-202740-marostegui.json
  • 20:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P55842 and previous config saved to /var/cache/conftool/dbconfig/20240129-201233-marostegui.json
  • 19:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P55841 and previous config saved to /var/cache/conftool/dbconfig/20240129-195725-marostegui.json
  • 19:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193 (T355609)', diff saved to https://phabricator.wikimedia.org/P55840 and previous config saved to /var/cache/conftool/dbconfig/20240129-194218-marostegui.json
  • 19:36 zabe@deploy2002: Finished scap: Backport for Start reading from af_actor/afh_actor everywhere (T355616) (duration: 09m 09s)
  • 19:33 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2193 (T355609)', diff saved to https://phabricator.wikimedia.org/P55839 and previous config saved to /var/cache/conftool/dbconfig/20240129-193317-marostegui.json
  • 19:33 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2193.codfw.wmnet with reason: Maintenance
  • 19:33 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2193.codfw.wmnet with reason: Maintenance
  • 19:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T355609)', diff saved to https://phabricator.wikimedia.org/P55838 and previous config saved to /var/cache/conftool/dbconfig/20240129-193254-marostegui.json
  • 19:29 zabe@deploy2002: zabe: Continuing with sync
  • 19:28 zabe@deploy2002: zabe: Backport for Start reading from af_actor/afh_actor everywhere (T355616) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 19:27 zabe@deploy2002: Started scap: Backport for Start reading from af_actor/afh_actor everywhere (T355616)
  • 19:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P55837 and previous config saved to /var/cache/conftool/dbconfig/20240129-191748-marostegui.json
  • 19:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P55836 and previous config saved to /var/cache/conftool/dbconfig/20240129-190241-marostegui.json
  • 19:01 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:01 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:00 ayounsi@cumin1002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin[1001-1002].eqiad.wmnet with reason: CR993089 - ayounsi@cumin1002
  • 18:59 ebernhardson@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:59 ebernhardson@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:58 ayounsi@cumin1002: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin[1001-1002].eqiad.wmnet with reason: CR993089 - ayounsi@cumin1002
  • 18:49 brouberol@cumin1001: END (FAIL) - Cookbook sre.hadoop.roll-restart-masters (exit_code=99) restart masters for Hadoop test cluster: Restart of jvm daemons.
  • 18:49 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:49 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T355609)', diff saved to https://phabricator.wikimedia.org/P55835 and previous config saved to /var/cache/conftool/dbconfig/20240129-184735-marostegui.json
  • 18:29 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2180 (T355609)', diff saved to https://phabricator.wikimedia.org/P55834 and previous config saved to /var/cache/conftool/dbconfig/20240129-182909-marostegui.json
  • 18:29 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2180.codfw.wmnet with reason: Maintenance
  • 18:28 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2180.codfw.wmnet with reason: Maintenance
  • 18:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316 (T355609)', diff saved to https://phabricator.wikimedia.org/P55833 and previous config saved to /var/cache/conftool/dbconfig/20240129-182846-marostegui.json
  • 18:24 isaranto@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 18:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316', diff saved to https://phabricator.wikimedia.org/P55832 and previous config saved to /var/cache/conftool/dbconfig/20240129-181340-marostegui.json
  • 17:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316', diff saved to https://phabricator.wikimedia.org/P55831 and previous config saved to /var/cache/conftool/dbconfig/20240129-175833-marostegui.json
  • 17:43 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 17:43 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 17:43 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 17:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316 (T355609)', diff saved to https://phabricator.wikimedia.org/P55830 and previous config saved to /var/cache/conftool/dbconfig/20240129-174327-marostegui.json
  • 17:43 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 17:42 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 17:42 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 17:34 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2171:3316 (T355609)', diff saved to https://phabricator.wikimedia.org/P55829 and previous config saved to /var/cache/conftool/dbconfig/20240129-173435-marostegui.json
  • 17:34 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2171.codfw.wmnet with reason: Maintenance
  • 17:34 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2171.codfw.wmnet with reason: Maintenance
  • 17:34 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2169.codfw.wmnet with reason: Maintenance
  • 17:34 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2169.codfw.wmnet with reason: Maintenance
  • 17:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T355609)', diff saved to https://phabricator.wikimedia.org/P55828 and previous config saved to /var/cache/conftool/dbconfig/20240129-173406-marostegui.json
  • 17:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P55824 and previous config saved to /var/cache/conftool/dbconfig/20240129-171859-marostegui.json
  • 17:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P55823 and previous config saved to /var/cache/conftool/dbconfig/20240129-170353-marostegui.json
  • 16:51 jdrewniak@deploy2002: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 06m 37s)
  • 16:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T355609)', diff saved to https://phabricator.wikimedia.org/P55822 and previous config saved to /var/cache/conftool/dbconfig/20240129-164846-marostegui.json
  • 16:44 jdrewniak@deploy2002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 07m 04s)
  • 16:40 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
  • 16:40 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2158 (T355609)', diff saved to https://phabricator.wikimedia.org/P55821 and previous config saved to /var/cache/conftool/dbconfig/20240129-164005-marostegui.json
  • 16:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 16:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 16:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2158.codfw.wmnet with reason: Maintenance
  • 16:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2158.codfw.wmnet with reason: Maintenance
  • 16:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T355609)', diff saved to https://phabricator.wikimedia.org/P55820 and previous config saved to /var/cache/conftool/dbconfig/20240129-163926-marostegui.json
  • 16:36 volans: installed spicerack 8.3.0 on cumin1002, cumin1001
  • 16:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P55819 and previous config saved to /var/cache/conftool/dbconfig/20240129-162420-marostegui.json
  • 16:20 ladsgroup@deploy2002: Finished scap: Backport for Drop old virtual domain for url shortener (duration: 09m 24s)
  • 16:14 ladsgroup@deploy2002: ladsgroup: Continuing with sync
  • 16:12 ladsgroup@deploy2002: ladsgroup: Backport for Drop old virtual domain for url shortener synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 16:11 ladsgroup@deploy2002: Started scap: Backport for Drop old virtual domain for url shortener
  • 16:10 urandom: decommissioning restbase2019/cassandra-{a,b,c} — T352469
  • 16:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P55817 and previous config saved to /var/cache/conftool/dbconfig/20240129-160913-marostegui.json
  • 16:08 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase2019.codfw.wmnet with reason: Decommissioning — T352469
  • 16:07 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase2019.codfw.wmnet with reason: Decommissioning — T352469
  • 15:58 brouberol@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-tool1009.eqiad.wmnet with OS buster
  • 15:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T355609)', diff saved to https://phabricator.wikimedia.org/P55816 and previous config saved to /var/cache/conftool/dbconfig/20240129-155406-marostegui.json
  • 15:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2151 (T355609)', diff saved to https://phabricator.wikimedia.org/P55815 and previous config saved to /var/cache/conftool/dbconfig/20240129-154444-marostegui.json
  • 15:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2151.codfw.wmnet with reason: Maintenance
  • 15:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2151.codfw.wmnet with reason: Maintenance
  • 15:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2129 (T355609)', diff saved to https://phabricator.wikimedia.org/P55814 and previous config saved to /var/cache/conftool/dbconfig/20240129-154422-marostegui.json
  • 15:34 brouberol@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-tool1009.eqiad.wmnet with reason: host reimage
  • 15:31 brouberol@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on an-tool1009.eqiad.wmnet with reason: host reimage
  • 15:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2129', diff saved to https://phabricator.wikimedia.org/P55811 and previous config saved to /var/cache/conftool/dbconfig/20240129-152915-marostegui.json
  • 15:26 Dreamy_Jazz: Running MediaModeration scanning script using `mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --use-jobqueue --sleep 30 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-30-no-render-now.txt` on a tmux session.
  • 15:24 isaranto@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 15:23 isaranto@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
  • 15:21 Dreamy_Jazz: Running `foreachwikiindblist group1.dblist extensions/MediaModeration/maintenance/resendMatchEmails.php 20200405 --verbose`
  • 15:19 Dreamy_Jazz: Running `foreachwikiindblist group2.dblist extensions/MediaModeration/maintenance/resendMatchEmails.php 20200405`
  • 15:17 Dreamy_Jazz: Stopping mediamoderation scanning script
  • 15:17 brouberol@cumin1002: START - Cookbook sre.hosts.reimage for host an-tool1009.eqiad.wmnet with OS buster
  • 15:15 Dreamy_Jazz: afternoon UTC backport window done
  • 15:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2129', diff saved to https://phabricator.wikimedia.org/P55810 and previous config saved to /var/cache/conftool/dbconfig/20240129-151409-marostegui.json
  • 15:14 dreamyjazz@deploy2002: Finished scap: Backport for Make the email subject unique for positive match emails (T355752) (duration: 21m 21s)
  • 15:13 ayounsi@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts sretest1005.eqiad.wmnet
  • 15:13 ayounsi@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:13 ayounsi@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: sretest1005.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - ayounsi@cumin2002"
  • 15:12 ayounsi@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: sretest1005.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - ayounsi@cumin2002"
  • 15:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-airflow1006.eqiad.wmnet
  • 15:04 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-gp1001.eqiad.wmnet
  • 15:04 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
  • 15:04 dreamyjazz@deploy2002: dreamyjazz: Backport for Make the email subject unique for positive match emails (T355752) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 15:00 ayounsi@cumin2002: START - Cookbook sre.dns.netbox
  • 14:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2129 (T355609)', diff saved to https://phabricator.wikimedia.org/P55809 and previous config saved to /var/cache/conftool/dbconfig/20240129-145902-marostegui.json
  • 14:58 hashar@deploy2002: Finished deploy [gerrit/gerrit@5594608]: wm-checks-api: direct link to build when only one failed - T355774 (duration: 00m 07s)
  • 14:58 hashar@deploy2002: Started deploy [gerrit/gerrit@5594608]: wm-checks-api: direct link to build when only one failed - T355774
  • 14:57 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc-gp1001.eqiad.wmnet
  • 14:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2129 (T355609)', diff saved to https://phabricator.wikimedia.org/P55808 and previous config saved to /var/cache/conftool/dbconfig/20240129-145652-marostegui.json
  • 14:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
  • 14:56 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
  • 14:56 ayounsi@cumin2002: START - Cookbook sre.hosts.decommission for hosts sretest1005.eqiad.wmnet
  • 14:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124 (T355609)', diff saved to https://phabricator.wikimedia.org/P55807 and previous config saved to /var/cache/conftool/dbconfig/20240129-145630-marostegui.json
  • 14:56 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2055.codfw.wmnet
  • 14:54 Dreamy_Jazz: scap backport is also backporting 993499 for T355357
  • 14:53 ayounsi@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=97) for new host sretest1005.eqiad.wmnet
  • 14:53 ayounsi@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1005.eqiad.wmnet with OS bookworm
  • 14:52 dreamyjazz@deploy2002: Started scap: Backport for Make the email subject unique for positive match emails (T355752)
  • 14:52 ayounsi@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1005.eqiad.wmnet with OS bookworm
  • 14:51 dreamyjazz@deploy2002: sync-world aborted: Backport for Make the email subject unique for positive match emails (T355752) (duration: 04m 13s)
  • 14:51 ayounsi@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM sretest1005.eqiad.wmnet - ayounsi@cumin2002"
  • 14:50 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2055.codfw.wmnet
  • 14:50 ayounsi@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM sretest1005.eqiad.wmnet - ayounsi@cumin2002"
  • 14:49 ayounsi@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) sretest1005.eqiad.wmnet on all recursors
  • 14:49 ayounsi@cumin2002: START - Cookbook sre.dns.wipe-cache sretest1005.eqiad.wmnet on all recursors
  • 14:49 ayounsi@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:49 ayounsi@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM sretest1005.eqiad.wmnet - ayounsi@cumin2002"
  • 14:48 ayounsi@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM sretest1005.eqiad.wmnet - ayounsi@cumin2002"
  • 14:47 dreamyjazz@deploy2002: Started scap: Backport for Make the email subject unique for positive match emails (T355752)
  • 14:46 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for hewikinews: remove wgExtraGenderNamespaces and add wgNamespaceAliases (T349581) (duration: 12m 29s)
  • 14:42 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host an-airflow1006.eqiad.wmnet
  • 14:42 ayounsi@cumin2002: START - Cookbook sre.dns.netbox
  • 14:42 ayounsi@cumin2002: START - Cookbook sre.ganeti.makevm for new host sretest1005.eqiad.wmnet
  • 14:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124', diff saved to https://phabricator.wikimedia.org/P55806 and previous config saved to /var/cache/conftool/dbconfig/20240129-144124-marostegui.json
  • 14:40 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: analytics_cluster::airflow::analytics_product
  • 14:40 logmsgbot: lucaswerkmeister-wmde@deploy2002 anzx and lucaswerkmeister-wmde: Continuing with sync
  • 14:37 brouberol@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host an-tool1009.eqiad.wmnet with OS bullseye
  • 14:36 logmsgbot: lucaswerkmeister-wmde@deploy2002 anzx and lucaswerkmeister-wmde: Backport for hewikinews: remove wgExtraGenderNamespaces and add wgNamespaceAliases (T349581) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:34 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for hewikinews: remove wgExtraGenderNamespaces and add wgNamespaceAliases (T349581)
  • 14:30 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: analytics_cluster::airflow::analytics_product
  • 14:30 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for knwiki: add portal namespace and fix talkpagenames of draft and module namespace (T355662 T346583) (duration: 08m 58s)
  • 14:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124', diff saved to https://phabricator.wikimedia.org/P55804 and previous config saved to /var/cache/conftool/dbconfig/20240129-142617-marostegui.json
  • 14:23 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host ceph2001.codfw.wmnet with OS bullseye
  • 14:23 logmsgbot: lucaswerkmeister-wmde@deploy2002 anzx and lucaswerkmeister-wmde: Continuing with sync
  • 14:22 logmsgbot: lucaswerkmeister-wmde@deploy2002 anzx and lucaswerkmeister-wmde: Backport for knwiki: add portal namespace and fix talkpagenames of draft and module namespace (T355662 T346583) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:21 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
  • 14:21 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for knwiki: add portal namespace and fix talkpagenames of draft and module namespace (T355662 T346583)
  • 14:17 volans: upgraded spicerack to 8.3.0 on cumin2002
  • 14:16 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for uzwiki: revert temporary logo for the 20th anniversary (T353723) (duration: 11m 01s)
  • 14:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124 (T355609)', diff saved to https://phabricator.wikimedia.org/P55803 and previous config saved to /var/cache/conftool/dbconfig/20240129-141111-marostegui.json
  • 14:10 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-airflow1006.eqiad.wmnet with OS bullseye
  • 14:09 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and anzx: Continuing with sync
  • 14:07 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and anzx: Backport for uzwiki: revert temporary logo for the 20th anniversary (T353723) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:05 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for uzwiki: revert temporary logo for the 20th anniversary (T353723)
  • 14:02 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2124 (T355609)', diff saved to https://phabricator.wikimedia.org/P55802 and previous config saved to /var/cache/conftool/dbconfig/20240129-140205-marostegui.json
  • 14:01 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2124.codfw.wmnet with reason: Maintenance
  • 14:01 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2124.codfw.wmnet with reason: Maintenance
  • 14:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2117 (T355609)', diff saved to https://phabricator.wikimedia.org/P55801 and previous config saved to /var/cache/conftool/dbconfig/20240129-140142-marostegui.json
  • 13:54 volans: uploaded spicerack_8.3.0 to apt.wikimedia.org bullseye-wikimedia
  • 13:48 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2355.codfw.wmnet with OS bullseye
  • 13:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2117', diff saved to https://phabricator.wikimedia.org/P55799 and previous config saved to /var/cache/conftool/dbconfig/20240129-134636-marostegui.json
  • 13:46 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2445.codfw.wmnet with OS bullseye
  • 13:40 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2429.codfw.wmnet with OS bullseye
  • 13:40 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-airflow1006.eqiad.wmnet with reason: host reimage
  • 13:37 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2381.codfw.wmnet with OS bullseye
  • 13:36 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on an-airflow1006.eqiad.wmnet with reason: host reimage
  • 13:35 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2260.codfw.wmnet with OS bullseye
  • 13:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2117', diff saved to https://phabricator.wikimedia.org/P55798 and previous config saved to /var/cache/conftool/dbconfig/20240129-133129-marostegui.json
  • 13:29 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2355.codfw.wmnet with reason: host reimage
  • 13:26 claime: Restarting ferm.service on k8s node kubernetes2055 - T354855
  • 13:25 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2445.codfw.wmnet with reason: host reimage
  • 13:23 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host an-airflow1006.eqiad.wmnet with OS bullseye
  • 13:23 brouberol@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-tool1009.eqiad.wmnet with reason: host reimage
  • 13:20 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2429.codfw.wmnet with reason: host reimage
  • 13:18 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2381.codfw.wmnet with reason: host reimage
  • 13:17 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2445.codfw.wmnet with reason: host reimage
  • 13:16 brouberol@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on an-tool1009.eqiad.wmnet with reason: host reimage
  • 13:16 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2429.codfw.wmnet with reason: host reimage
  • 13:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2117 (T355609)', diff saved to https://phabricator.wikimedia.org/P55797 and previous config saved to /var/cache/conftool/dbconfig/20240129-131623-marostegui.json
  • 13:15 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2260.codfw.wmnet with reason: host reimage
  • 13:14 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2381.codfw.wmnet with reason: host reimage
  • 13:13 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2355.codfw.wmnet with reason: host reimage
  • 13:12 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2260.codfw.wmnet with reason: host reimage
  • 13:07 brouberol@cumin1002: START - Cookbook sre.hosts.reimage for host an-tool1009.eqiad.wmnet with OS bullseye
  • 13:07 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2117 (T355609)', diff saved to https://phabricator.wikimedia.org/P55796 and previous config saved to /var/cache/conftool/dbconfig/20240129-130724-marostegui.json
  • 13:07 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2117.codfw.wmnet with reason: Maintenance
  • 13:07 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2117.codfw.wmnet with reason: Maintenance
  • 13:00 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2445.codfw.wmnet with OS bullseye
  • 12:59 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2429.codfw.wmnet with OS bullseye
  • 12:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2097.codfw.wmnet with reason: Maintenance
  • 12:58 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2097.codfw.wmnet with reason: Maintenance
  • 12:58 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2381.codfw.wmnet with OS bullseye
  • 12:57 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
  • 12:57 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
  • 12:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231 (T355609)', diff saved to https://phabricator.wikimedia.org/P55795 and previous config saved to /var/cache/conftool/dbconfig/20240129-125726-marostegui.json
  • 12:57 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2355.codfw.wmnet with OS bullseye
  • 12:56 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2260.codfw.wmnet with OS bullseye
  • 12:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P55794 and previous config saved to /var/cache/conftool/dbconfig/20240129-124220-marostegui.json
  • 12:33 moritzm: installing openssh security updates
  • 12:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P55793 and previous config saved to /var/cache/conftool/dbconfig/20240129-122713-marostegui.json
  • 12:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-airflow1007.eqiad.wmnet
  • 12:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host an-airflow1007.eqiad.wmnet
  • 12:14 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: analytics_cluster::airflow::wmde
  • 12:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231 (T355609)', diff saved to https://phabricator.wikimedia.org/P55792 and previous config saved to /var/cache/conftool/dbconfig/20240129-121205-marostegui.json
  • 12:06 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1231 (T355609)', diff saved to https://phabricator.wikimedia.org/P55791 and previous config saved to /var/cache/conftool/dbconfig/20240129-120628-marostegui.json
  • 12:06 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1231.eqiad.wmnet with reason: Maintenance
  • 12:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1231.eqiad.wmnet with reason: Maintenance
  • 12:00 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: analytics_cluster::airflow::wmde
  • 12:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1225.eqiad.wmnet with reason: Maintenance
  • 11:59 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1225.eqiad.wmnet with reason: Maintenance
  • 11:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224 (T355609)', diff saved to https://phabricator.wikimedia.org/P55790 and previous config saved to /var/cache/conftool/dbconfig/20240129-115953-marostegui.json
  • 11:53 Dreamy_Jazz: Running mwscript maintenance/sql.php --wiki=testwiki --wikidb=centralauth ~/T354700-create-table-global.sql for T354700
  • 11:45 Dreamy_Jazz: sql.php finished for T354700
  • 11:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P55789 and previous config saved to /var/cache/conftool/dbconfig/20240129-114446-marostegui.json
  • 11:41 Dreamy_Jazz: T354700 - Running `foreachwiki maintenance/sql.php ~/T354700-create-table.sql`
  • 11:39 Dreamy_Jazz: T354700 - Ran mwscript maintenance/sql.php --wiki=testwiki ~/T354700-create-table.sql
  • 11:38 moritzm: upload ganeti 3.0.2-3+wmf1 (bookworm package of Ganeti plus backport for SSL chain handling in RAPI) to apt.wikimedia.org T300152
  • 11:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P55788 and previous config saved to /var/cache/conftool/dbconfig/20240129-112940-marostegui.json
  • 11:28 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-airflow1007.eqiad.wmnet with OS bullseye
  • 11:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224 (T355609)', diff saved to https://phabricator.wikimedia.org/P55787 and previous config saved to /var/cache/conftool/dbconfig/20240129-111434-marostegui.json
  • 11:09 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1224 (T355609)', diff saved to https://phabricator.wikimedia.org/P55786 and previous config saved to /var/cache/conftool/dbconfig/20240129-110955-marostegui.json
  • 11:09 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1224.eqiad.wmnet with reason: Maintenance
  • 11:09 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1224.eqiad.wmnet with reason: Maintenance
  • 11:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213:3316 (T355609)', diff saved to https://phabricator.wikimedia.org/P55785 and previous config saved to /var/cache/conftool/dbconfig/20240129-110933-marostegui.json
  • 11:05 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-airflow1007.eqiad.wmnet with reason: host reimage
  • 11:01 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on an-airflow1007.eqiad.wmnet with reason: host reimage
  • 10:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213:3316', diff saved to https://phabricator.wikimedia.org/P55784 and previous config saved to /var/cache/conftool/dbconfig/20240129-105427-marostegui.json
  • 10:53 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1054.eqiad.wmnet
  • 10:53 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2054.codfw.wmnet
  • 10:47 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2054.codfw.wmnet
  • 10:47 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1054.eqiad.wmnet
  • 10:47 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host an-airflow1007.eqiad.wmnet with OS bullseye
  • 10:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213:3316', diff saved to https://phabricator.wikimedia.org/P55783 and previous config saved to /var/cache/conftool/dbconfig/20240129-103920-marostegui.json
  • 10:38 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
  • 10:37 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
  • 10:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213:3316 (T355609)', diff saved to https://phabricator.wikimedia.org/P55782 and previous config saved to /var/cache/conftool/dbconfig/20240129-102414-marostegui.json
  • 10:18 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1213:3316 (T355609)', diff saved to https://phabricator.wikimedia.org/P55781 and previous config saved to /var/cache/conftool/dbconfig/20240129-101757-marostegui.json
  • 10:17 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1213.eqiad.wmnet with reason: Maintenance
  • 10:17 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1213.eqiad.wmnet with reason: Maintenance
  • 10:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201 (T355609)', diff saved to https://phabricator.wikimedia.org/P55780 and previous config saved to /var/cache/conftool/dbconfig/20240129-101735-marostegui.json
  • 10:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201', diff saved to https://phabricator.wikimedia.org/P55779 and previous config saved to /var/cache/conftool/dbconfig/20240129-100229-marostegui.json
  • 10:01 moritzm: upload prometheus-ganeti-exporter 0.3+deb12u1 to apt.wikimedia.org T300152
  • 09:56 XioNoX: enable Puppet on all the ganeti servers for CR990968 deployment - T300152
  • 09:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201', diff saved to https://phabricator.wikimedia.org/P55778 and previous config saved to /var/cache/conftool/dbconfig/20240129-094722-marostegui.json
  • 09:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201 (T355609)', diff saved to https://phabricator.wikimedia.org/P55777 and previous config saved to /var/cache/conftool/dbconfig/20240129-093216-marostegui.json
  • 09:27 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1201 (T355609)', diff saved to https://phabricator.wikimedia.org/P55776 and previous config saved to /var/cache/conftool/dbconfig/20240129-092724-marostegui.json
  • 09:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1201.eqiad.wmnet with reason: Maintenance
  • 09:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1201.eqiad.wmnet with reason: Maintenance
  • 09:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187 (T355609)', diff saved to https://phabricator.wikimedia.org/P55775 and previous config saved to /var/cache/conftool/dbconfig/20240129-092702-marostegui.json
  • 09:17 godog: mark for deletion and cleanup replicated thanos blocks for prometheus=ops, older than 3 months, all resolutions - T351927
  • 09:13 moritzm: upgrading python-pymysql in S7 DB hosts to 1.0.2-2~wmf11u1 T355531
  • 09:13 XioNoX: disable Puppet on all the ganeti servers for CR990968 deployment - T300152
  • 09:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187', diff saved to https://phabricator.wikimedia.org/P55773 and previous config saved to /var/cache/conftool/dbconfig/20240129-091156-marostegui.json
  • 08:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187', diff saved to https://phabricator.wikimedia.org/P55772 and previous config saved to /var/cache/conftool/dbconfig/20240129-085649-marostegui.json
  • 08:46 marostegui@deploy2002: Finished scap: Backport for Revert "ProductionServices.php: Promote pc2014" (duration: 17m 13s)
  • 08:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187 (T355609)', diff saved to https://phabricator.wikimedia.org/P55771 and previous config saved to /var/cache/conftool/dbconfig/20240129-084143-marostegui.json
  • 08:39 marostegui@deploy2002: marostegui: Continuing with sync
  • 08:39 marostegui@deploy2002: marostegui: Backport for Revert "ProductionServices.php: Promote pc2014" synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:36 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1187 (T355609)', diff saved to https://phabricator.wikimedia.org/P55770 and previous config saved to /var/cache/conftool/dbconfig/20240129-083627-marostegui.json
  • 08:36 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1187.eqiad.wmnet with reason: Maintenance
  • 08:36 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1187.eqiad.wmnet with reason: Maintenance
  • 08:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T355609)', diff saved to https://phabricator.wikimedia.org/P55769 and previous config saved to /var/cache/conftool/dbconfig/20240129-083603-marostegui.json
  • 08:29 marostegui@deploy2002: Started scap: Backport for Revert "ProductionServices.php: Promote pc2014"
  • 08:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P55768 and previous config saved to /var/cache/conftool/dbconfig/20240129-082057-marostegui.json
  • 08:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P55767 and previous config saved to /var/cache/conftool/dbconfig/20240129-080550-marostegui.json
  • 07:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T355609)', diff saved to https://phabricator.wikimedia.org/P55766 and previous config saved to /var/cache/conftool/dbconfig/20240129-075044-marostegui.json
  • 07:45 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1180 (T355609)', diff saved to https://phabricator.wikimedia.org/P55765 and previous config saved to /var/cache/conftool/dbconfig/20240129-074541-marostegui.json
  • 07:45 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
  • 07:45 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
  • 07:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T355609)', diff saved to https://phabricator.wikimedia.org/P55764 and previous config saved to /var/cache/conftool/dbconfig/20240129-074519-marostegui.json
  • 07:38 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 100%: After schema change', diff saved to https://phabricator.wikimedia.org/P55763 and previous config saved to /var/cache/conftool/dbconfig/20240129-073857-root.json
  • 07:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P55762 and previous config saved to /var/cache/conftool/dbconfig/20240129-073012-marostegui.json
  • 07:23 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 75%: After schema change', diff saved to https://phabricator.wikimedia.org/P55761 and previous config saved to /var/cache/conftool/dbconfig/20240129-072352-root.json
  • 07:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P55760 and previous config saved to /var/cache/conftool/dbconfig/20240129-071506-marostegui.json
  • 07:08 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 50%: After schema change', diff saved to https://phabricator.wikimedia.org/P55758 and previous config saved to /var/cache/conftool/dbconfig/20240129-070847-root.json
  • 07:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T355609)', diff saved to https://phabricator.wikimedia.org/P55757 and previous config saved to /var/cache/conftool/dbconfig/20240129-065959-marostegui.json
  • 06:54 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1168 (T355609)', diff saved to https://phabricator.wikimedia.org/P55756 and previous config saved to /var/cache/conftool/dbconfig/20240129-065450-marostegui.json
  • 06:54 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
  • 06:54 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
  • 06:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T355609)', diff saved to https://phabricator.wikimedia.org/P55755 and previous config saved to /var/cache/conftool/dbconfig/20240129-065427-marostegui.json
  • 06:53 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 25%: After schema change', diff saved to https://phabricator.wikimedia.org/P55754 and previous config saved to /var/cache/conftool/dbconfig/20240129-065341-root.json
  • 06:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P55752 and previous config saved to /var/cache/conftool/dbconfig/20240129-063920-marostegui.json
  • 06:38 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 10%: After schema change', diff saved to https://phabricator.wikimedia.org/P55751 and previous config saved to /var/cache/conftool/dbconfig/20240129-063836-root.json
  • 06:33 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2129', diff saved to https://phabricator.wikimedia.org/P55750 and previous config saved to /var/cache/conftool/dbconfig/20240129-063302-marostegui.json
  • 06:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P55747 and previous config saved to /var/cache/conftool/dbconfig/20240129-062414-marostegui.json
  • 06:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T355609)', diff saved to https://phabricator.wikimedia.org/P55746 and previous config saved to /var/cache/conftool/dbconfig/20240129-060907-marostegui.json
  • 06:04 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1165 (T355609)', diff saved to https://phabricator.wikimedia.org/P55745 and previous config saved to /var/cache/conftool/dbconfig/20240129-060400-marostegui.json
  • 06:03 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 06:03 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 06:03 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
  • 06:03 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
  • 05:57 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db1134.eqiad.wmnet
  • 05:57 marostegui@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 05:57 marostegui@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1134.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1002"
  • 05:56 marostegui@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1134.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1002"
  • 05:54 marostegui@cumin1002: START - Cookbook sre.dns.netbox
  • 05:49 marostegui@cumin1002: START - Cookbook sre.hosts.decommission for hosts db1134.eqiad.wmnet

2024-01-28

  • 01:11 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase2016.codfw.wmnet with reason: Decommissioning — T352469
  • 01:11 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase2016.codfw.wmnet with reason: Decommissioning — T352469
  • 01:10 urandom: decommissioning restbase2016/cassandra-{a,b,c} — T352469

2024-01-26

  • 22:07 bking@cumin2002: END (FAIL) - Cookbook sre.puppet.migrate-host (exit_code=99) for host cloudelastic1006.wikimedia.org
  • 22:06 bking@cumin2002: START - Cookbook sre.puppet.migrate-host for host cloudelastic1006.wikimedia.org
  • 22:05 bking@cumin2002: END (FAIL) - Cookbook sre.puppet.migrate-host (exit_code=99) for host cloudelastic1006.wikimedia.org
  • 22:04 bking@cumin2002: START - Cookbook sre.puppet.migrate-host for host cloudelastic1006.wikimedia.org
  • 19:02 ejegg: fundraising civicrm upgraded from 8c0dc1d2 to b953d667
  • 18:27 mutante: cloudweb1003 - OATHAuth disabled for Triciaburmeister. (after video verification - T355958)
  • 18:16 mutante: phab1004 - removing 2fa from TBurmeister (after video verification) T355958
  • 17:57 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1010.eqiad.wmnet with OS bullseye
  • 17:57 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - bking@cumin2002"
  • 17:53 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - bking@cumin2002"
  • 17:37 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
  • 17:34 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
  • 17:17 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1010.eqiad.wmnet with OS bullseye
  • 17:12 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1010
  • 17:11 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1010
  • 17:09 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:09 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: sync cloudelastic1010 IPs - bking@cumin2002"
  • 17:08 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: sync cloudelastic1010 IPs - bking@cumin2002"
  • 17:04 bking@cumin2002: START - Cookbook sre.dns.netbox
  • 16:34 bking@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudelastic1010.wikimedia.org
  • 16:33 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:33 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudelastic1010.wikimedia.org decommissioned, removing all IPs except the asset tag one - bking@cumin2002"
  • 16:33 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: activate new elastic config - bking@cumin2002 - T355617
  • 16:32 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudelastic1010.wikimedia.org decommissioned, removing all IPs except the asset tag one - bking@cumin2002"
  • 16:30 arnaudb@cumin1002: dbctl commit (dc=all): 'Cloning db2169 in db2194 for T343674', diff saved to https://phabricator.wikimedia.org/P55740 and previous config saved to /var/cache/conftool/dbconfig/20240126-163057-arnaudb.json
  • 16:29 bking@cumin2002: START - Cookbook sre.dns.netbox
  • 16:23 bking@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudelastic1010.wikimedia.org
  • 16:15 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: activate new elastic config - bking@cumin2002 - T355617
  • 15:01 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
  • 15:00 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
  • 14:47 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
  • 14:46 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
  • 14:37 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
  • 14:37 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
  • 14:36 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase2015.codfw.wmnet with reason: Decommissioning — T352469
  • 14:35 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase2015.codfw.wmnet with reason: Decommissioning — T352469
  • 14:34 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
  • 14:34 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
  • 14:33 urandom: decommissioning restbase2015/cassandra-{a,b,c} — T352469
  • 14:27 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
  • 14:27 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
  • 14:24 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
  • 14:24 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
  • 14:08 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
  • 14:08 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
  • 13:18 eoghan@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Gitlab security upgrade
  • 12:36 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:36 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: codfw routed cluster svc - ayounsi@cumin1002"
  • 12:35 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: codfw routed cluster svc - ayounsi@cumin1002"
  • 12:30 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
  • 11:43 taavi: reprepro: copy helm-diff_3.1.3-2 from bullseye-wikimedia to bookworm-wikimedia
  • 11:28 eoghan@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Gitlab security upgrade
  • 10:52 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
  • 10:51 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
  • 10:50 eoghan@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Gitlab security upgrade
  • 10:44 eoghan@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Gitlab security upgrade
  • 10:36 moritzm: prune obsolete nginx packages from eventschema hosts after migration to new library scheme T329529
  • 10:25 arnaudb@cumin1002: dbctl commit (dc=all): 'Cloning db2169 in db2194 for T343674', diff saved to https://phabricator.wikimedia.org/P55737 and previous config saved to /var/cache/conftool/dbconfig/20240126-102550-arnaudb.json
  • 08:01 moritzm: rebalance codfw/B following switch maintenance T355549
  • 07:54 moritzm: failover ganeti master for codfw back to ganeti2022, switch maintenance is completed T355549
  • 01:01 dzahn@cumin1002: END (FAIL) - Cookbook sre.gitlab.upgrade (exit_code=99) on GitLab host gitlab1004.wikimedia.org with reason: security release
  • 00:07 dzahn@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: security release
  • 00:00 dzahn@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: security release

2024-01-25

  • 23:54 zabe: zabe@mwmaint2002:~$ mwscript namespaceDupes.php --wiki=wikimaniawiki --fix # T347622
  • 23:54 zabe@deploy2002: Finished scap: Backport for Setup namespace for 2025, 2026, enable subpages for 2023-2026 (T347622) (duration: 08m 30s)
  • 23:47 zabe@deploy2002: robertsky and zabe: Continuing with sync
  • 23:47 zabe@deploy2002: robertsky and zabe: Backport for Setup namespace for 2025, 2026, enable subpages for 2023-2026 (T347622) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 23:45 zabe@deploy2002: Started scap: Backport for Setup namespace for 2025, 2026, enable subpages for 2023-2026 (T347622)
  • 23:29 zabe: zabe@mwmaint2002:/tmp/uploads$ mwscript importImages.php --wiki=commonswiki --comment-ext=txt --user=Sturm . # T355485
  • 23:17 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on cloudelastic1010.wikimedia.org with reason: migration canary T355617
  • 23:17 bking@cumin2002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on cloudelastic1010.wikimedia.org with reason: migration canary T355617
  • 22:54 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Banning hosts: cloudelastic1010.wikimedia.org for use cloudelastic1010 as migration canary - bking@cumin2002 - T355617
  • 22:53 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Banning hosts: cloudelastic1010.wikimedia.org for use cloudelastic1010 as migration canary - bking@cumin2002 - T355617
  • 22:53 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.ban (exit_code=99) Banning hosts: cloudelastic1010 for use cloudelastic1010 as migration canary - bking@cumin2002 - T355617
  • 22:53 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Banning hosts: cloudelastic1010 for use cloudelastic1010 as migration canary - bking@cumin2002 - T355617
  • 22:52 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.ban (exit_code=99) Banning hosts: cloudelastic1010 for use cloudelastic1010 as migration canary - bking@cumin2002 - T355617
  • 22:52 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Banning hosts: cloudelastic1010 for use cloudelastic1010 as migration canary - bking@cumin2002 - T355617
  • 22:40 ryankemper: T351354 Restarting `cloudelastic1006` (final restart for today)
  • 22:34 ryankemper: T351354 Now restarting new masters to keep configs in sync; restarting `cloudelastic1009`
  • 22:33 ryankemper: T351354 Now restarting new masters to keep configs in sync; restarting `cloudelastic1007`
  • 22:26 ryankemper: T351354 Restarting `cloudelastic1002`
  • 22:19 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:19 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:15 ryankemper: T351354 Restarting `cloudelastic1004` following puppet run
  • 22:12 dzahn@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: security release
  • 22:11 ryankemper: T351354 Merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/993038; restarting `cloudelastic1001` following puppet run
  • 22:08 ryankemper: T351354 Downtimed `cloudelastic*`; shortly will restart `cloudelastic100[1,2,4]` one host at a time to make them no longer masters
  • 22:08 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 10 hosts with reason: cloudelastic maintenance
  • 22:07 ryankemper@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on 10 hosts with reason: cloudelastic maintenance
  • 21:55 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:55 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:44 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:44 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:44 ebernhardson@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:44 ebernhardson@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:19 ebernhardson@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:19 ebernhardson@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:14 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:14 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:13 ebernhardson@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:13 ebernhardson@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:58 ebernhardson@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:58 ebernhardson@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:57 ebernhardson@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:57 ebernhardson@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:56 ebernhardson@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:56 ebernhardson@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:55 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudrabbit1002.eqiad.wmnet with OS bookworm
  • 20:55 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - taavi@cumin1002"
  • 20:54 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - taavi@cumin1002"
  • 20:51 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudrabbit1001.eqiad.wmnet with OS bookworm
  • 20:51 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - taavi@cumin1002"
  • 20:50 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - taavi@cumin1002"
  • 20:37 ebernhardson@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:37 ebernhardson@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:36 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudrabbit1002.eqiad.wmnet with reason: host reimage
  • 20:35 ebernhardson@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:35 ebernhardson@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:33 ebernhardson@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:33 ebernhardson@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:33 taavi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudrabbit1002.eqiad.wmnet with reason: host reimage
  • 20:32 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudrabbit1001.eqiad.wmnet with reason: host reimage
  • 20:27 taavi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudrabbit1001.eqiad.wmnet with reason: host reimage
  • 20:26 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "set cloudrabbit1001/2 as active - taavi@cumin1002"
  • 20:25 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "set cloudrabbit1001/2 as active - taavi@cumin1002"
  • 20:19 taavi@cumin1002: START - Cookbook sre.hosts.reimage for host cloudrabbit1002.eqiad.wmnet with OS bookworm
  • 20:19 taavi@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudrabbit1002.eqiad.wmnet with OS bookworm
  • 20:16 zabe@deploy2002: Finished scap: Backport for Start reading from af_actor/afh_actor in group1 wikis (T355616) (duration: 11m 27s)
  • 20:15 ebernhardson@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:15 ebernhardson@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:11 taavi@cumin1002: START - Cookbook sre.hosts.reimage for host cloudrabbit1002.eqiad.wmnet with OS bookworm
  • 20:10 taavi@cumin1002: START - Cookbook sre.hosts.reimage for host cloudrabbit1001.eqiad.wmnet with OS bookworm
  • 20:10 zabe@deploy2002: zabe: Continuing with sync
  • 20:09 zabe@deploy2002: zabe: Backport for Start reading from af_actor/afh_actor in group1 wikis (T355616) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:06 taavi@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudrabbit1001
  • 20:05 zabe@deploy2002: Started scap: Backport for Start reading from af_actor/afh_actor in group1 wikis (T355616)
  • 20:05 taavi@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host cloudrabbit1001
  • 20:05 taavi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 20:05 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add IPs for cloudrabbit1001 - taavi@cumin1002"
  • 20:04 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add IPs for cloudrabbit1001 - taavi@cumin1002"
  • 20:02 taavi@cumin1002: START - Cookbook sre.dns.netbox
  • 20:01 taavi@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudrabbit1002
  • 20:00 taavi@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host cloudrabbit1002
  • 19:59 taavi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:59 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add IPs for cloudrabbit1002 - taavi@cumin1002"
  • 19:58 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add IPs for cloudrabbit1002 - taavi@cumin1002"
  • 19:56 taavi@cumin1002: START - Cookbook sre.dns.netbox
  • 19:29 bking@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 19:29 bking@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 19:28 ebernhardson@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:28 ebernhardson@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:25 bking@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 19:24 bking@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 18:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2188 (re)pooling @ 100%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55736 and previous config saved to /var/cache/conftool/dbconfig/20240125-184922-root.json
  • 18:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2178 (re)pooling @ 100%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55735 and previous config saved to /var/cache/conftool/dbconfig/20240125-184917-root.json
  • 18:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2177 (re)pooling @ 100%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55734 and previous config saved to /var/cache/conftool/dbconfig/20240125-184911-root.json
  • 18:49 marostegui@cumin1002: dbctl commit (dc=all): 'db2147 (re)pooling @ 100%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55733 and previous config saved to /var/cache/conftool/dbconfig/20240125-184906-root.json
  • 18:49 marostegui@cumin1002: dbctl commit (dc=all): 'db2137:3315 (re)pooling @ 100%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55732 and previous config saved to /var/cache/conftool/dbconfig/20240125-184900-root.json
  • 18:49 marostegui@cumin1002: dbctl commit (dc=all): 'db2137:3314 (re)pooling @ 100%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55731 and previous config saved to /var/cache/conftool/dbconfig/20240125-184853-root.json
  • 18:48 marostegui@cumin1002: dbctl commit (dc=all): 'db2107 (re)pooling @ 100%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55730 and previous config saved to /var/cache/conftool/dbconfig/20240125-184845-root.json
  • 18:48 marostegui@cumin1002: dbctl commit (dc=all): 'db2109 (re)pooling @ 100%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55729 and previous config saved to /var/cache/conftool/dbconfig/20240125-184839-root.json
  • 18:48 marostegui@cumin1002: dbctl commit (dc=all): 'db2159 (re)pooling @ 100%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55728 and previous config saved to /var/cache/conftool/dbconfig/20240125-184823-root.json
  • 18:47 mutante: phab2002 - rebooting
  • 18:46 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: reboot
  • 18:45 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on phab2002.codfw.wmnet with reason: reboot
  • 18:35 marostegui@cumin1002: dbctl commit (dc=all): 'db2188 (re)pooling @ 75%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55727 and previous config saved to /var/cache/conftool/dbconfig/20240125-183417-root.json
  • 18:34 marostegui@cumin1002: dbctl commit (dc=all): 'db2178 (re)pooling @ 75%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55726 and previous config saved to /var/cache/conftool/dbconfig/20240125-183412-root.json
  • 18:34 marostegui@cumin1002: dbctl commit (dc=all): 'db2177 (re)pooling @ 75%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55725 and previous config saved to /var/cache/conftool/dbconfig/20240125-183406-root.json
  • 18:34 marostegui@cumin1002: dbctl commit (dc=all): 'db2147 (re)pooling @ 75%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55724 and previous config saved to /var/cache/conftool/dbconfig/20240125-183401-root.json
  • 18:34 marostegui@cumin1002: dbctl commit (dc=all): 'db2137:3315 (re)pooling @ 75%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55723 and previous config saved to /var/cache/conftool/dbconfig/20240125-183355-root.json
  • 18:34 marostegui@cumin1002: dbctl commit (dc=all): 'db2137:3314 (re)pooling @ 75%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55722 and previous config saved to /var/cache/conftool/dbconfig/20240125-183348-root.json
  • 18:33 marostegui@cumin1002: dbctl commit (dc=all): 'db2107 (re)pooling @ 75%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55721 and previous config saved to /var/cache/conftool/dbconfig/20240125-183340-root.json
  • 18:33 marostegui@cumin1002: dbctl commit (dc=all): 'db2109 (re)pooling @ 75%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55720 and previous config saved to /var/cache/conftool/dbconfig/20240125-183334-root.json
  • 18:33 marostegui@cumin1002: dbctl commit (dc=all): 'db2159 (re)pooling @ 75%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55719 and previous config saved to /var/cache/conftool/dbconfig/20240125-183318-root.json
  • 18:20 marostegui@cumin1002: dbctl commit (dc=all): 'db2188 (re)pooling @ 50%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55718 and previous config saved to /var/cache/conftool/dbconfig/20240125-181912-root.json
  • 18:19 marostegui@cumin1002: dbctl commit (dc=all): 'db2178 (re)pooling @ 50%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55717 and previous config saved to /var/cache/conftool/dbconfig/20240125-181907-root.json
  • 18:19 marostegui@cumin1002: dbctl commit (dc=all): 'db2177 (re)pooling @ 50%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55716 and previous config saved to /var/cache/conftool/dbconfig/20240125-181901-root.json
  • 18:19 marostegui@cumin1002: dbctl commit (dc=all): 'db2147 (re)pooling @ 50%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55715 and previous config saved to /var/cache/conftool/dbconfig/20240125-181856-root.json
  • 18:19 marostegui@cumin1002: dbctl commit (dc=all): 'db2137:3315 (re)pooling @ 50%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55714 and previous config saved to /var/cache/conftool/dbconfig/20240125-181850-root.json
  • 18:18 marostegui@cumin1002: dbctl commit (dc=all): 'db2137:3314 (re)pooling @ 50%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55713 and previous config saved to /var/cache/conftool/dbconfig/20240125-181843-root.json
  • 18:18 marostegui@cumin1002: dbctl commit (dc=all): 'db2107 (re)pooling @ 50%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55712 and previous config saved to /var/cache/conftool/dbconfig/20240125-181835-root.json
  • 18:18 marostegui@cumin1002: dbctl commit (dc=all): 'db2109 (re)pooling @ 50%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55711 and previous config saved to /var/cache/conftool/dbconfig/20240125-181829-root.json
  • 18:18 marostegui@cumin1002: dbctl commit (dc=all): 'db2159 (re)pooling @ 50%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55710 and previous config saved to /var/cache/conftool/dbconfig/20240125-181814-root.json
  • 18:13 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host durum6001.drmrs.wmnet with OS bookworm
  • 18:06 marostegui@cumin1002: dbctl commit (dc=all): 'db2188 (re)pooling @ 25%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55709 and previous config saved to /var/cache/conftool/dbconfig/20240125-180407-root.json
  • 18:06 marostegui@cumin1002: dbctl commit (dc=all): 'db2178 (re)pooling @ 25%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55708 and previous config saved to /var/cache/conftool/dbconfig/20240125-180402-root.json
  • 18:05 marostegui@cumin1002: dbctl commit (dc=all): 'db2177 (re)pooling @ 25%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55707 and previous config saved to /var/cache/conftool/dbconfig/20240125-180356-root.json
  • 18:05 marostegui@cumin1002: dbctl commit (dc=all): 'db2147 (re)pooling @ 25%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55706 and previous config saved to /var/cache/conftool/dbconfig/20240125-180351-root.json
  • 18:04 marostegui@cumin1002: dbctl commit (dc=all): 'db2137:3315 (re)pooling @ 25%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55705 and previous config saved to /var/cache/conftool/dbconfig/20240125-180345-root.json
  • 18:04 marostegui@cumin1002: dbctl commit (dc=all): 'db2137:3314 (re)pooling @ 25%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55704 and previous config saved to /var/cache/conftool/dbconfig/20240125-180338-root.json
  • 18:03 marostegui@cumin1002: dbctl commit (dc=all): 'db2107 (re)pooling @ 25%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55703 and previous config saved to /var/cache/conftool/dbconfig/20240125-180330-root.json
  • 18:03 marostegui@cumin1002: dbctl commit (dc=all): 'db2109 (re)pooling @ 25%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55702 and previous config saved to /var/cache/conftool/dbconfig/20240125-180324-root.json
  • 18:03 marostegui@cumin1002: dbctl commit (dc=all): 'db2159 (re)pooling @ 25%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55701 and previous config saved to /var/cache/conftool/dbconfig/20240125-180308-root.json
  • 18:01 sukhe: running authdns-update for CR 993008: T355835
  • 17:51 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2140.codfw.wmnet with reason: Maintenance
  • 17:51 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2140.codfw.wmnet with reason: Maintenance
  • 17:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2188 (re)pooling @ 10%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55700 and previous config saved to /var/cache/conftool/dbconfig/20240125-174902-root.json
  • 17:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2178 (re)pooling @ 10%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55699 and previous config saved to /var/cache/conftool/dbconfig/20240125-174857-root.json
  • 17:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2177 (re)pooling @ 10%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55698 and previous config saved to /var/cache/conftool/dbconfig/20240125-174851-root.json
  • 17:49 marostegui@cumin1002: dbctl commit (dc=all): 'db2147 (re)pooling @ 10%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55697 and previous config saved to /var/cache/conftool/dbconfig/20240125-174846-root.json
  • 17:49 marostegui@cumin1002: dbctl commit (dc=all): 'db2137:3315 (re)pooling @ 10%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55696 and previous config saved to /var/cache/conftool/dbconfig/20240125-174840-root.json
  • 17:49 marostegui@cumin1002: dbctl commit (dc=all): 'db2137:3314 (re)pooling @ 10%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55695 and previous config saved to /var/cache/conftool/dbconfig/20240125-174833-root.json
  • 17:48 marostegui@cumin1002: dbctl commit (dc=all): 'db2107 (re)pooling @ 10%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55694 and previous config saved to /var/cache/conftool/dbconfig/20240125-174825-root.json
  • 17:48 marostegui@cumin1002: dbctl commit (dc=all): 'db2109 (re)pooling @ 10%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55693 and previous config saved to /var/cache/conftool/dbconfig/20240125-174819-root.json
  • 17:48 marostegui@cumin1002: dbctl commit (dc=all): 'db2159 (re)pooling @ 10%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55692 and previous config saved to /var/cache/conftool/dbconfig/20240125-174803-root.json
  • 17:47 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on durum6001.drmrs.wmnet with reason: host reimage
  • 17:45 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for asw-b-codfw,lsw1-b5-codfw.mgmt
  • 17:45 cmooney@cumin1002: START - Cookbook sre.hosts.remove-downtime for asw-b-codfw,lsw1-b5-codfw.mgmt
  • 17:43 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on durum6001.drmrs.wmnet with reason: host reimage
  • 17:38 btullis@deploy2002: helmfile [eqiad] DONE helmfile.d/services/datahub: sync on main
  • 17:34 btullis@deploy2002: helmfile [eqiad] START helmfile.d/services/datahub: apply on main
  • 17:33 btullis@deploy2002: helmfile [codfw] DONE helmfile.d/services/datahub: sync on main
  • 17:30 Amir1: deploying new captchas (T141490)
  • 17:22 btullis@deploy2002: helmfile [codfw] START helmfile.d/services/datahub: apply on main
  • 17:22 btullis@deploy2002: helmfile [staging] DONE helmfile.d/services/datahub: sync on main
  • 17:21 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host durum6001.drmrs.wmnet with OS bookworm
  • 17:17 btullis@deploy2002: helmfile [staging] START helmfile.d/services/datahub: apply on main
  • 17:09 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:09 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:07 taavi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:05 taavi@cumin1002: START - Cookbook sre.dns.netbox
  • 17:04 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudrabbit[1001-1002].wikimedia.org
  • 17:04 taavi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:04 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudrabbit[1001-1002].wikimedia.org decommissioned, removing all IPs except the asset tag one - taavi@cumin1002"
  • 17:01 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: apply
  • 17:01 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: apply
  • 17:00 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudrabbit[1001-1002].wikimedia.org decommissioned, removing all IPs except the asset tag one - taavi@cumin1002"
  • 16:56 taavi@cumin1002: START - Cookbook sre.dns.netbox
  • 16:52 sukhe: running authdns-update for CR 992936: T355835
  • 16:49 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase2014.codfw.wmnet with reason: Decommissioning — T352469
  • 16:49 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase2014.codfw.wmnet with reason: Decommissioning — T352469
  • 16:48 taavi@cumin1002: START - Cookbook sre.hosts.decommission for hosts cloudrabbit[1001-1002].wikimedia.org
  • 16:48 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase2013.codfw.wmnet with reason: Decommissioning — T352469
  • 16:48 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase2013.codfw.wmnet with reason: Decommissioning — T352469
  • 16:43 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 32 hosts
  • 16:42 cmooney@cumin1002: START - Cookbook sre.hosts.remove-downtime for 32 hosts
  • 16:42 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cr[1-2]-codfw
  • 16:41 cmooney@cumin1002: START - Cookbook sre.hosts.remove-downtime for cr[1-2]-codfw
  • 16:34 cgoubert@cumin2002: conftool action : set/pooled=yes; selector: name=parse2007.codfw.wmnet
  • 16:34 claime: repooling parse2007 - T355549
  • 16:33 cgoubert@cumin2002: conftool action : set/pooled=yes; selector: name=parse2006.codfw.wmnet
  • 16:33 claime: repooling parse2006 - T355549
  • 16:32 claime: uncordoning kubernetes2023 - T355549
  • 16:32 claime: uncordoning kubernetes2032 - T355549
  • 16:29 claime: uncordoning kubernetes2031 - T355549
  • 16:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T354336)', diff saved to https://phabricator.wikimedia.org/P55691 and previous config saved to /var/cache/conftool/dbconfig/20240125-161320-marostegui.json
  • 16:03 topranks: Network maintenance codfw rack b5 underway T355549
  • 15:58 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:30:00 on 32 hosts with reason: Migrating servers in codfw rack B5 to lsw1-b5-codfw T355549
  • 15:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P55690 and previous config saved to /var/cache/conftool/dbconfig/20240125-155813-marostegui.json
  • 15:58 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 1:30:00 on 32 hosts with reason: Migrating servers in codfw rack B5 to lsw1-b5-codfw T355549
  • 15:57 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:30:00 on cr[1-2]-codfw with reason: prepping for server uplink migration
  • 15:57 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 1:30:00 on cr[1-2]-codfw with reason: prepping for server uplink migration
  • 15:54 arnaudb@cumin1002: dbctl commit (dc=all): 'preparing to clone db2169 on db2196 as per T343674', diff saved to https://phabricator.wikimedia.org/P55689 and previous config saved to /var/cache/conftool/dbconfig/20240125-155450-arnaudb.json
  • 15:52 topranks: disabling puppet fleet-wide to allow for maintenance in codfw rack b5 which hosts puppetmaster2003 T355549
  • 15:46 topranks: configuring lsw1-b5-codfw switch ports for servers to be moved T355549
  • 15:46 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on asw-b-codfw,lsw1-b5-codfw.mgmt with reason: prepping for server uplink migration
  • 15:46 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on asw-b-codfw,lsw1-b5-codfw.mgmt with reason: prepping for server uplink migration
  • 15:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P55688 and previous config saved to /var/cache/conftool/dbconfig/20240125-154307-marostegui.json
  • 15:33 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: wcqs::public
  • 15:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T354336)', diff saved to https://phabricator.wikimedia.org/P55687 and previous config saved to /var/cache/conftool/dbconfig/20240125-152801-marostegui.json
  • 15:25 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: wcqs::public
  • 15:20 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: wdqs::internal
  • 15:20 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps2006.cofw.wmnet
  • 15:19 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: apply
  • 15:18 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: apply
  • 15:10 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: wdqs::internal
  • 14:35 cgoubert@cumin2002: conftool action : set/pooled=inactive; selector: name=parse2007.codfw.wmnet
  • 14:35 claime: Depooling parse2007 (setting inactive) - T355549
  • 14:34 cgoubert@cumin2002: conftool action : set/pooled=inactive; selector: name=parse2006.codfw.wmnet
  • 14:34 claime: Depooling parse2006 (setting inactive) - T355549
  • 14:27 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2179 (T354336)', diff saved to https://phabricator.wikimedia.org/P55684 and previous config saved to /var/cache/conftool/dbconfig/20240125-142729-marostegui.json
  • 14:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2179.codfw.wmnet with reason: Maintenance
  • 14:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2179.codfw.wmnet with reason: Maintenance
  • 14:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T354336)', diff saved to https://phabricator.wikimedia.org/P55683 and previous config saved to /var/cache/conftool/dbconfig/20240125-142706-marostegui.json
  • 14:26 moritzm: installing debmonitor-client 0.3.4 fleet-wide
  • 14:25 claime: Draining kubernetes2023 - T355549
  • 14:25 claime: Draining kubernetes2033 - T355549
  • 14:23 claime: Draining kubernetes2032 - T355549
  • 14:21 claime: Draining kubernetes2031 - T355549
  • 14:21 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 100%: After T355885', diff saved to https://phabricator.wikimedia.org/P55682 and previous config saved to /var/cache/conftool/dbconfig/20240125-142102-root.json
  • 14:18 btullis@cumin1002: END (PASS) - Cookbook sre.hadoop.roll-restart-workers (exit_code=0) restart workers for Hadoop test cluster: Roll restart of jvm daemons for openjdk upgrade.
  • 14:15 moritzm: failover ganeti master for codfw to ganeti2020 T355549
  • 14:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P55681 and previous config saved to /var/cache/conftool/dbconfig/20240125-141200-marostegui.json
  • 14:05 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 75%: After T355885', diff saved to https://phabricator.wikimedia.org/P55680 and previous config saved to /var/cache/conftool/dbconfig/20240125-140557-root.json
  • 14:05 btullis@cumin1002: START - Cookbook sre.hadoop.roll-restart-workers restart workers for Hadoop test cluster: Roll restart of jvm daemons for openjdk upgrade.
  • 13:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P55679 and previous config saved to /var/cache/conftool/dbconfig/20240125-135653-marostegui.json
  • 13:53 btullis@cumin1002: END (PASS) - Cookbook sre.druid.roll-restart-workers (exit_code=0) for Druid test cluster: Roll restart of Druid jvm daemons.
  • 13:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 50%: After T355885', diff saved to https://phabricator.wikimedia.org/P55678 and previous config saved to /var/cache/conftool/dbconfig/20240125-135052-root.json
  • 13:47 volans: uploaded debmonitor-client_0.3.4 to apt.wikimedia.org buster-wikimedia,bullseye-wikimedia,bookworm-wikimedia
  • 13:43 btullis@cumin1002: START - Cookbook sre.druid.roll-restart-workers for Druid test cluster: Roll restart of Druid jvm daemons.
  • 13:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T354336)', diff saved to https://phabricator.wikimedia.org/P55677 and previous config saved to /var/cache/conftool/dbconfig/20240125-134147-marostegui.json
  • 13:39 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2172 (T354336)', diff saved to https://phabricator.wikimedia.org/P55676 and previous config saved to /var/cache/conftool/dbconfig/20240125-133935-marostegui.json
  • 13:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2172.codfw.wmnet with reason: Maintenance
  • 13:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2172.codfw.wmnet with reason: Maintenance
  • 13:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T354336)', diff saved to https://phabricator.wikimedia.org/P55675 and previous config saved to /var/cache/conftool/dbconfig/20240125-133913-marostegui.json
  • 13:35 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 25%: After T355885', diff saved to https://phabricator.wikimedia.org/P55674 and previous config saved to /var/cache/conftool/dbconfig/20240125-133547-root.json
  • 13:32 cmooney@cumin1002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2022.codfw.wmnet
  • 13:28 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2357.codfw.wmnet with OS bullseye
  • 13:28 topranks: draining VMs from ganeti2022 ahead of codfw rack b5 maintenance T355549
  • 13:27 cmooney@cumin1002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2022.codfw.wmnet
  • 13:27 cmooney@cumin1002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2021.codfw.wmnet
  • 13:26 cmooney@cumin1002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2021.codfw.wmnet
  • 13:26 cmooney@cumin1002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2021.codfw.wmnet
  • 13:26 cmooney@cumin1002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2021.codfw.wmnet
  • 13:25 topranks: stopping logstash service on logstash2025 to facilitate VM migration T355549
  • 13:25 cmooney@cumin1002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2021.codfw.wmnet
  • 13:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P55673 and previous config saved to /var/cache/conftool/dbconfig/20240125-132407-marostegui.json
  • 13:24 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2267.codfw.wmnet with OS bullseye
  • 13:21 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2395.codfw.wmnet with OS bullseye
  • 13:20 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 10%: After T355885', diff saved to https://phabricator.wikimedia.org/P55672 and previous config saved to /var/cache/conftool/dbconfig/20240125-132043-root.json
  • 13:18 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2129.codfw.wmnet with reason: Maintenance
  • 13:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2129.codfw.wmnet with reason: Maintenance
  • 13:15 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2129', diff saved to https://phabricator.wikimedia.org/P55671 and previous config saved to /var/cache/conftool/dbconfig/20240125-131547-marostegui.json
  • 13:12 hashar@deploy2002: rebuilt and synchronized wikiversions files: group2 wikis to 1.42.0-wmf.15 refs T354433
  • 13:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P55670 and previous config saved to /var/cache/conftool/dbconfig/20240125-130900-marostegui.json
  • 13:08 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2357.codfw.wmnet with reason: host reimage
  • 13:05 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2267.codfw.wmnet with reason: host reimage
  • 13:02 cmooney@cumin1002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2021.codfw.wmnet
  • 13:02 topranks: draining VMs from ganeti2021 ahead of codfw rack b5 maintenance T355549
  • 13:02 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2395.codfw.wmnet with reason: host reimage
  • 12:58 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2267.codfw.wmnet with reason: host reimage
  • 12:58 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2357.codfw.wmnet with reason: host reimage
  • 12:57 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2395.codfw.wmnet with reason: host reimage
  • 12:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T354336)', diff saved to https://phabricator.wikimedia.org/P55669 and previous config saved to /var/cache/conftool/dbconfig/20240125-125353-marostegui.json
  • 12:41 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2267.codfw.wmnet with OS bullseye
  • 12:41 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2395.codfw.wmnet with OS bullseye
  • 12:41 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2357.codfw.wmnet with OS bullseye
  • 12:12 jgiannelos@deploy2002: Finished deploy [restbase/deploy@708f0f3]: (no justification provided) (duration: 20m 28s)
  • 12:06 moritzm: installing openssh security updates
  • 11:53 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2155 (T354336)', diff saved to https://phabricator.wikimedia.org/P55667 and previous config saved to /var/cache/conftool/dbconfig/20240125-115322-marostegui.json
  • 11:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 11:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 11:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2155.codfw.wmnet with reason: Maintenance
  • 11:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2155.codfw.wmnet with reason: Maintenance
  • 11:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T354336)', diff saved to https://phabricator.wikimedia.org/P55666 and previous config saved to /var/cache/conftool/dbconfig/20240125-115233-marostegui.json
  • 11:52 jgiannelos@deploy2002: Started deploy [restbase/deploy@708f0f3]: (no justification provided)
  • 11:45 zabe@deploy2002: Finished scap: Backport for Start reading from af_actor/afh_actor in group0 wikis (T355616) (duration: 08m 25s)
  • 11:44 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1038.eqiad.wmnet to cluster eqiad and group D
  • 11:42 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1038.eqiad.wmnet to cluster eqiad and group D
  • 11:38 zabe@deploy2002: zabe: Continuing with sync
  • 11:38 zabe@deploy2002: zabe: Backport for Start reading from af_actor/afh_actor in group0 wikis (T355616) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 11:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P55665 and previous config saved to /var/cache/conftool/dbconfig/20240125-113727-marostegui.json
  • 11:36 zabe@deploy2002: Started scap: Backport for Start reading from af_actor/afh_actor in group0 wikis (T355616)
  • 11:29 hashar@deploy2002: Finished scap: Backport for UserGroupManager: Fix cross-wiki database access (T355813) (duration: 08m 50s)
  • 11:26 claime: Restarting ferm.service on k8s node kubernetes2036.codfw.wmnet - T354855
  • 11:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2107.codfw.wmnet with reason: Maintenance
  • 11:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2107.codfw.wmnet with reason: Maintenance
  • 11:23 hashar@deploy2002: hashar and zabe: Continuing with sync
  • 11:22 hashar@deploy2002: hashar and zabe: Backport for UserGroupManager: Fix cross-wiki database access (T355813) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 11:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P55664 and previous config saved to /var/cache/conftool/dbconfig/20240125-112220-marostegui.json
  • 11:20 hashar@deploy2002: Started scap: Backport for UserGroupManager: Fix cross-wiki database access (T355813)
  • 11:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T354336)', diff saved to https://phabricator.wikimedia.org/P55663 and previous config saved to /var/cache/conftool/dbconfig/20240125-110714-marostegui.json
  • 11:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2147.codfw.wmnet with reason: Maintenance
  • 11:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2147.codfw.wmnet with reason: Maintenance
  • 11:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 11:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 11:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314 (T354336)', diff saved to https://phabricator.wikimedia.org/P55662 and previous config saved to /var/cache/conftool/dbconfig/20240125-110521-marostegui.json
  • 10:57 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudrabbit1003.eqiad.wmnet with OS bookworm
  • 10:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314', diff saved to https://phabricator.wikimedia.org/P55660 and previous config saved to /var/cache/conftool/dbconfig/20240125-105015-marostegui.json
  • 10:39 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudrabbit1003.eqiad.wmnet with reason: host reimage
  • 10:38 isaranto@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 10:35 taavi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudrabbit1003.eqiad.wmnet with reason: host reimage
  • 10:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314', diff saved to https://phabricator.wikimedia.org/P55659 and previous config saved to /var/cache/conftool/dbconfig/20240125-103509-marostegui.json
  • 10:21 taavi@cumin1002: START - Cookbook sre.hosts.reimage for host cloudrabbit1003.eqiad.wmnet with OS bookworm
  • 10:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314 (T354336)', diff saved to https://phabricator.wikimedia.org/P55658 and previous config saved to /var/cache/conftool/dbconfig/20240125-102002-marostegui.json
  • 10:17 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2138:3314 (T354336)', diff saved to https://phabricator.wikimedia.org/P55657 and previous config saved to /var/cache/conftool/dbconfig/20240125-101750-marostegui.json
  • 10:17 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2138.codfw.wmnet with reason: Maintenance
  • 10:17 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2138.codfw.wmnet with reason: Maintenance
  • 10:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314 (T354336)', diff saved to https://phabricator.wikimedia.org/P55656 and previous config saved to /var/cache/conftool/dbconfig/20240125-101728-marostegui.json
  • 10:17 moritzm: upgrading python-pymysql in S6 DB hosts to 1.0.2-2~wmf11u1 T355531
  • 10:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314', diff saved to https://phabricator.wikimedia.org/P55655 and previous config saved to /var/cache/conftool/dbconfig/20240125-100221-marostegui.json
  • 09:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314', diff saved to https://phabricator.wikimedia.org/P55654 and previous config saved to /var/cache/conftool/dbconfig/20240125-094714-marostegui.json
  • 09:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314 (T354336)', diff saved to https://phabricator.wikimedia.org/P55653 and previous config saved to /var/cache/conftool/dbconfig/20240125-093208-marostegui.json
  • 09:29 stran@deploy2002: Finished scap: Backport for PreAuthenticationProvider: Allow blocking account creation based on IP reputation (T354928) (duration: 17m 24s)
  • 09:18 stran@deploy2002: kharlan and stran: Continuing with sync
  • 09:14 stran@deploy2002: kharlan and stran: Backport for PreAuthenticationProvider: Allow blocking account creation based on IP reputation (T354928) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 09:12 stran@deploy2002: Started scap: Backport for PreAuthenticationProvider: Allow blocking account creation based on IP reputation (T354928)
  • 08:45 stran@deploy2002: stran and kharlan: Backport for PreAuthenticationProvider: Allow blocking account creation based on IP reputation (T354928) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:31 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2137.codfw.wmnet with reason: Maintenance
  • 08:31 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2137.codfw.wmnet with reason: Maintenance
  • 08:31 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2136.codfw.wmnet with reason: Maintenance
  • 08:31 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2136.codfw.wmnet with reason: Maintenance
  • 08:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2119 (T354336)', diff saved to https://phabricator.wikimedia.org/P55652 and previous config saved to /var/cache/conftool/dbconfig/20240125-083106-marostegui.json
  • 08:16 stran@deploy2002: Started scap: Backport for PreAuthenticationProvider: Allow blocking account creation based on IP reputation (T354928)
  • 08:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2119', diff saved to https://phabricator.wikimedia.org/P55651 and previous config saved to /var/cache/conftool/dbconfig/20240125-081559-marostegui.json
  • 08:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2119', diff saved to https://phabricator.wikimedia.org/P55650 and previous config saved to /var/cache/conftool/dbconfig/20240125-080053-marostegui.json
  • 07:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2119 (T354336)', diff saved to https://phabricator.wikimedia.org/P55648 and previous config saved to /var/cache/conftool/dbconfig/20240125-074546-marostegui.json
  • 07:43 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2119 (T354336)', diff saved to https://phabricator.wikimedia.org/P55647 and previous config saved to /var/cache/conftool/dbconfig/20240125-074334-marostegui.json
  • 07:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2119.codfw.wmnet with reason: Maintenance
  • 07:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2119.codfw.wmnet with reason: Maintenance
  • 07:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2110 (T354336)', diff saved to https://phabricator.wikimedia.org/P55646 and previous config saved to /var/cache/conftool/dbconfig/20240125-074312-marostegui.json
  • 07:33 marostegui@cumin1002: dbctl commit (dc=all): 'db2136 (re)pooling @ 100%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55645 and previous config saved to /var/cache/conftool/dbconfig/20240125-073319-root.json
  • 07:33 marostegui@cumin1002: dbctl commit (dc=all): 'es2026 (re)pooling @ 100%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55644 and previous config saved to /var/cache/conftool/dbconfig/20240125-073310-root.json
  • 07:32 marostegui@cumin1002: dbctl commit (dc=all): 'db2157 (re)pooling @ 100%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55643 and previous config saved to /var/cache/conftool/dbconfig/20240125-073252-root.json
  • 07:32 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 100%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55642 and previous config saved to /var/cache/conftool/dbconfig/20240125-073244-root.json
  • 07:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2110', diff saved to https://phabricator.wikimedia.org/P55641 and previous config saved to /var/cache/conftool/dbconfig/20240125-072806-marostegui.json
  • 07:20 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2137:3315 T355549', diff saved to https://phabricator.wikimedia.org/P55640 and previous config saved to /var/cache/conftool/dbconfig/20240125-072010-marostegui.json
  • 07:18 marostegui@cumin1002: dbctl commit (dc=all): 'db2136 (re)pooling @ 75%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55639 and previous config saved to /var/cache/conftool/dbconfig/20240125-071813-root.json
  • 07:18 marostegui@cumin1002: dbctl commit (dc=all): 'es2026 (re)pooling @ 75%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55638 and previous config saved to /var/cache/conftool/dbconfig/20240125-071805-root.json
  • 07:17 marostegui@cumin1002: dbctl commit (dc=all): 'db2157 (re)pooling @ 75%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55637 and previous config saved to /var/cache/conftool/dbconfig/20240125-071747-root.json
  • 07:17 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 75%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55636 and previous config saved to /var/cache/conftool/dbconfig/20240125-071739-root.json
  • 07:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2110', diff saved to https://phabricator.wikimedia.org/P55635 and previous config saved to /var/cache/conftool/dbconfig/20240125-071259-marostegui.json
  • 07:12 marostegui@cumin1002: dbctl commit (dc=all): 'db2159 db2160 db2109 db2107 db2137:3314 db2135:3315 db2143 db2147 db2177 db2178 db2188 T355549', diff saved to https://phabricator.wikimedia.org/P55634 and previous config saved to /var/cache/conftool/dbconfig/20240125-071253-marostegui.json
  • 07:06 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2107 T355682', diff saved to https://phabricator.wikimedia.org/P55633 and previous config saved to /var/cache/conftool/dbconfig/20240125-070604-marostegui.json
  • 07:03 marostegui@cumin1002: dbctl commit (dc=all): 'db2136 (re)pooling @ 50%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55632 and previous config saved to /var/cache/conftool/dbconfig/20240125-070308-root.json
  • 07:03 marostegui@cumin1002: dbctl commit (dc=all): 'es2026 (re)pooling @ 50%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55631 and previous config saved to /var/cache/conftool/dbconfig/20240125-070300-root.json
  • 07:02 marostegui@cumin1002: dbctl commit (dc=all): 'db2157 (re)pooling @ 50%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55630 and previous config saved to /var/cache/conftool/dbconfig/20240125-070242-root.json
  • 07:02 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 50%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55629 and previous config saved to /var/cache/conftool/dbconfig/20240125-070234-root.json
  • 07:01 marostegui@cumin1002: dbctl commit (dc=all): 'Promote db2104 to s2 primary and set section read-write T355682', diff saved to https://phabricator.wikimedia.org/P55628 and previous config saved to /var/cache/conftool/dbconfig/20240125-070153-marostegui.json
  • 07:01 marostegui@cumin1002: dbctl commit (dc=all): 'Set s2 codfw as read-only for maintenance - T355682', diff saved to https://phabricator.wikimedia.org/P55627 and previous config saved to /var/cache/conftool/dbconfig/20240125-070120-marostegui.json
  • 07:00 marostegui: Starting s2 codfw failover from db2107 to db2104 - T355682
  • 06:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2110 (T354336)', diff saved to https://phabricator.wikimedia.org/P55626 and previous config saved to /var/cache/conftool/dbconfig/20240125-065535-marostegui.json
  • 06:48 marostegui@cumin1002: dbctl commit (dc=all): 'db2136 (re)pooling @ 25%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55625 and previous config saved to /var/cache/conftool/dbconfig/20240125-064803-root.json
  • 06:47 marostegui@cumin1002: dbctl commit (dc=all): 'es2026 (re)pooling @ 25%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55624 and previous config saved to /var/cache/conftool/dbconfig/20240125-064755-root.json
  • 06:47 marostegui@cumin1002: dbctl commit (dc=all): 'db2157 (re)pooling @ 25%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55623 and previous config saved to /var/cache/conftool/dbconfig/20240125-064737-root.json
  • 06:47 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 25%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55622 and previous config saved to /var/cache/conftool/dbconfig/20240125-064729-root.json
  • 06:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2110 (T354336)', diff saved to https://phabricator.wikimedia.org/P55621 and previous config saved to /var/cache/conftool/dbconfig/20240125-064420-marostegui.json
  • 06:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2110.codfw.wmnet with reason: Maintenance
  • 06:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2110.codfw.wmnet with reason: Maintenance
  • 06:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2106 (T354336)', diff saved to https://phabricator.wikimedia.org/P55620 and previous config saved to /var/cache/conftool/dbconfig/20240125-064357-marostegui.json
  • 06:37 marostegui@deploy2002: Finished scap: Backport for ProductionServices.php: Promote pc2014 (T355683) (duration: 08m 42s)
  • 06:32 marostegui@cumin1002: dbctl commit (dc=all): 'db2136 (re)pooling @ 10%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55619 and previous config saved to /var/cache/conftool/dbconfig/20240125-063258-root.json
  • 06:32 marostegui@cumin1002: dbctl commit (dc=all): 'es2026 (re)pooling @ 10%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55618 and previous config saved to /var/cache/conftool/dbconfig/20240125-063250-root.json
  • 06:32 marostegui@cumin1002: dbctl commit (dc=all): 'db2157 (re)pooling @ 10%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55617 and previous config saved to /var/cache/conftool/dbconfig/20240125-063232-root.json
  • 06:32 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 10%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55616 and previous config saved to /var/cache/conftool/dbconfig/20240125-063225-root.json
  • 06:31 marostegui@deploy2002: marostegui: Continuing with sync
  • 06:31 marostegui@deploy2002: marostegui: Backport for ProductionServices.php: Promote pc2014 (T355683) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 06:29 marostegui@deploy2002: Started scap: Backport for ProductionServices.php: Promote pc2014 (T355683)
  • 06:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2106', diff saved to https://phabricator.wikimedia.org/P55615 and previous config saved to /var/cache/conftool/dbconfig/20240125-062851-marostegui.json
  • 06:17 marostegui@cumin1002: dbctl commit (dc=all): 'db2136 (re)pooling @ 5%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55614 and previous config saved to /var/cache/conftool/dbconfig/20240125-061753-root.json
  • 06:17 marostegui@cumin1002: dbctl commit (dc=all): 'es2026 (re)pooling @ 5%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55613 and previous config saved to /var/cache/conftool/dbconfig/20240125-061745-root.json
  • 06:17 marostegui@cumin1002: dbctl commit (dc=all): 'db2157 (re)pooling @ 5%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55612 and previous config saved to /var/cache/conftool/dbconfig/20240125-061727-root.json
  • 06:17 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 5%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55611 and previous config saved to /var/cache/conftool/dbconfig/20240125-061719-root.json
  • 06:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2106', diff saved to https://phabricator.wikimedia.org/P55610 and previous config saved to /var/cache/conftool/dbconfig/20240125-061344-marostegui.json
  • 06:11 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 28 hosts with reason: Primary switchover s2 T355682
  • 06:10 marostegui@cumin1002: dbctl commit (dc=all): 'Set db2104 with weight 0 T355682', diff saved to https://phabricator.wikimedia.org/P55609 and previous config saved to /var/cache/conftool/dbconfig/20240125-061048-root.json
  • 06:10 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 28 hosts with reason: Primary switchover s2 T355682
  • 06:02 marostegui@cumin1002: dbctl commit (dc=all): 'db2136 (re)pooling @ 1%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55608 and previous config saved to /var/cache/conftool/dbconfig/20240125-060249-root.json
  • 06:02 marostegui@cumin1002: dbctl commit (dc=all): 'es2026 (re)pooling @ 1%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55607 and previous config saved to /var/cache/conftool/dbconfig/20240125-060240-root.json
  • 06:02 marostegui@cumin1002: dbctl commit (dc=all): 'db2157 (re)pooling @ 1%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55606 and previous config saved to /var/cache/conftool/dbconfig/20240125-060222-root.json
  • 06:02 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 1%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55605 and previous config saved to /var/cache/conftool/dbconfig/20240125-060214-root.json
  • 05:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2106 (T354336)', diff saved to https://phabricator.wikimedia.org/P55604 and previous config saved to /var/cache/conftool/dbconfig/20240125-055837-marostegui.json
  • 05:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2106 (T354336)', diff saved to https://phabricator.wikimedia.org/P55603 and previous config saved to /var/cache/conftool/dbconfig/20240125-055626-marostegui.json
  • 05:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2106.codfw.wmnet with reason: Maintenance
  • 05:56 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2106.codfw.wmnet with reason: Maintenance
  • 05:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2099.codfw.wmnet with reason: Maintenance
  • 05:55 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2099.codfw.wmnet with reason: Maintenance
  • 05:51 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1160.eqiad.wmnet with reason: Maintenance
  • 05:50 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1160.eqiad.wmnet with reason: Maintenance
  • 02:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 02:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 02:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1249 (T354336)', diff saved to https://phabricator.wikimedia.org/P55602 and previous config saved to /var/cache/conftool/dbconfig/20240125-022727-marostegui.json
  • 02:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P55601 and previous config saved to /var/cache/conftool/dbconfig/20240125-021221-marostegui.json
  • 01:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P55600 and previous config saved to /var/cache/conftool/dbconfig/20240125-015714-marostegui.json
  • 01:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1249 (T354336)', diff saved to https://phabricator.wikimedia.org/P55599 and previous config saved to /var/cache/conftool/dbconfig/20240125-014208-marostegui.json
  • 01:40 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1249 (T354336)', diff saved to https://phabricator.wikimedia.org/P55598 and previous config saved to /var/cache/conftool/dbconfig/20240125-013958-marostegui.json
  • 01:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1249.eqiad.wmnet with reason: Maintenance
  • 01:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1249.eqiad.wmnet with reason: Maintenance
  • 01:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1248 (T354336)', diff saved to https://phabricator.wikimedia.org/P55597 and previous config saved to /var/cache/conftool/dbconfig/20240125-013936-marostegui.json
  • 01:28 fab@deploy2002: Finished deploy [airflow-dags/research@e6aa85a]: (no justification provided) (duration: 00m 13s)
  • 01:28 fab@deploy2002: Started deploy [airflow-dags/research@e6aa85a]: (no justification provided)
  • 01:25 eileen: civicrm upgraded from b85b6dde to 69d4ebe3
  • 01:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P55596 and previous config saved to /var/cache/conftool/dbconfig/20240125-012430-marostegui.json
  • 01:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P55595 and previous config saved to /var/cache/conftool/dbconfig/20240125-010923-marostegui.json
  • 00:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1248 (T354336)', diff saved to https://phabricator.wikimedia.org/P55594 and previous config saved to /var/cache/conftool/dbconfig/20240125-005417-marostegui.json
  • 00:53 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1248 (T354336)', diff saved to https://phabricator.wikimedia.org/P55593 and previous config saved to /var/cache/conftool/dbconfig/20240125-005307-marostegui.json
  • 00:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1248.eqiad.wmnet with reason: Maintenance
  • 00:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1248.eqiad.wmnet with reason: Maintenance
  • 00:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1247 (T354336)', diff saved to https://phabricator.wikimedia.org/P55592 and previous config saved to /var/cache/conftool/dbconfig/20240125-005245-marostegui.json
  • 00:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P55591 and previous config saved to /var/cache/conftool/dbconfig/20240125-003739-marostegui.json
  • 00:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P55590 and previous config saved to /var/cache/conftool/dbconfig/20240125-002233-marostegui.json
  • 00:12 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2103.codfw.wmnet with OS bullseye
  • 00:12 zabe@deploy2002: Finished scap: Backport for Start reading from af_user(_text)/afh_user(_text) in testwiki (T355616) (duration: 09m 36s)
  • 00:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1247 (T354336)', diff saved to https://phabricator.wikimedia.org/P55589 and previous config saved to /var/cache/conftool/dbconfig/20240125-000726-marostegui.json
  • 00:05 zabe@deploy2002: zabe: Continuing with sync
  • 00:05 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1247 (T354336)', diff saved to https://phabricator.wikimedia.org/P55588 and previous config saved to /var/cache/conftool/dbconfig/20240125-000515-marostegui.json
  • 00:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1247.eqiad.wmnet with reason: Maintenance
  • 00:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1247.eqiad.wmnet with reason: Maintenance
  • 00:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1243 (T354336)', diff saved to https://phabricator.wikimedia.org/P55587 and previous config saved to /var/cache/conftool/dbconfig/20240125-000452-marostegui.json
  • 00:04 zabe@deploy2002: zabe: Backport for Start reading from af_user(_text)/afh_user(_text) in testwiki (T355616) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 00:02 zabe@deploy2002: Started scap: Backport for Start reading from af_user(_text)/afh_user(_text) in testwiki (T355616)

2024-01-24

  • 23:54 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2103.codfw.wmnet with reason: host reimage
  • 23:51 ryankemper@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2103.codfw.wmnet with reason: host reimage
  • 23:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P55586 and previous config saved to /var/cache/conftool/dbconfig/20240124-234946-marostegui.json
  • 23:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P55585 and previous config saved to /var/cache/conftool/dbconfig/20240124-233439-marostegui.json
  • 23:34 ryankemper@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2103.codfw.wmnet with OS bullseye
  • 23:33 jforrester@deploy2002: Finished scap: Backport for Revert "Update spacing to improve consistency of ul/ol spacing, also update heading spacing to be more consistent, relying on mw defaults more" (T355805 T354433) (duration: 13m 29s)
  • 23:32 ryankemper@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2105.codfw.wmnet with OS bullseye
  • 23:32 ryankemper@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2104.codfw.wmnet with OS bullseye
  • 23:26 jforrester@deploy2002: jforrester: Continuing with sync
  • 23:21 jforrester@deploy2002: jforrester: Backport for Revert "Update spacing to improve consistency of ul/ol spacing, also update heading spacing to be more consistent, relying on mw defaults more" (T355805 T354433) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 23:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1243 (T354336)', diff saved to https://phabricator.wikimedia.org/P55584 and previous config saved to /var/cache/conftool/dbconfig/20240124-231933-marostegui.json
  • 23:19 jforrester@deploy2002: Started scap: Backport for Revert "Update spacing to improve consistency of ul/ol spacing, also update heading spacing to be more consistent, relying on mw defaults more" (T355805 T354433)
  • 23:17 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1243 (T354336)', diff saved to https://phabricator.wikimedia.org/P55583 and previous config saved to /var/cache/conftool/dbconfig/20240124-231723-marostegui.json
  • 23:17 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1243.eqiad.wmnet with reason: Maintenance
  • 23:17 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1243.eqiad.wmnet with reason: Maintenance
  • 23:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1242 (T354336)', diff saved to https://phabricator.wikimedia.org/P55582 and previous config saved to /var/cache/conftool/dbconfig/20240124-231701-marostegui.json
  • 23:04 ryankemper@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2103.codfw.wmnet with OS bullseye
  • 23:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P55581 and previous config saved to /var/cache/conftool/dbconfig/20240124-230155-marostegui.json
  • 22:50 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2106.codfw.wmnet with OS bullseye
  • 22:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P55580 and previous config saved to /var/cache/conftool/dbconfig/20240124-224648-marostegui.json
  • 22:39 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 10 hosts with reason: cloudelastic maintenance
  • 22:39 ryankemper@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on 10 hosts with reason: cloudelastic maintenance
  • 22:33 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2106.codfw.wmnet with reason: host reimage
  • 22:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1242 (T354336)', diff saved to https://phabricator.wikimedia.org/P55579 and previous config saved to /var/cache/conftool/dbconfig/20240124-223142-marostegui.json
  • 22:29 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1242 (T354336)', diff saved to https://phabricator.wikimedia.org/P55578 and previous config saved to /var/cache/conftool/dbconfig/20240124-222932-marostegui.json
  • 22:29 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1242.eqiad.wmnet with reason: Maintenance
  • 22:29 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1242.eqiad.wmnet with reason: Maintenance
  • 22:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1241 (T354336)', diff saved to https://phabricator.wikimedia.org/P55577 and previous config saved to /var/cache/conftool/dbconfig/20240124-222910-marostegui.json
  • 22:28 ryankemper@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2106.codfw.wmnet with reason: host reimage
  • 22:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P55576 and previous config saved to /var/cache/conftool/dbconfig/20240124-221403-marostegui.json
  • 22:11 ryankemper@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2105.codfw.wmnet with OS bullseye
  • 22:11 ryankemper@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2106.codfw.wmnet with OS bullseye
  • 22:11 ryankemper@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2104.codfw.wmnet with OS bullseye
  • 22:10 ryankemper@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2103.codfw.wmnet with OS bullseye
  • 21:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P55575 and previous config saved to /var/cache/conftool/dbconfig/20240124-215857-marostegui.json
  • 21:45 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching restbase[2022-2035].codfw.wmnet: Updated Cassandra to 4.1.1-wmf1 — T355719 - eevans@cumin1002
  • 21:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1241 (T354336)', diff saved to https://phabricator.wikimedia.org/P55574 and previous config saved to /var/cache/conftool/dbconfig/20240124-214351-marostegui.json
  • 21:41 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1241 (T354336)', diff saved to https://phabricator.wikimedia.org/P55573 and previous config saved to /var/cache/conftool/dbconfig/20240124-214141-marostegui.json
  • 21:41 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1241.eqiad.wmnet with reason: Maintenance
  • 21:41 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1241.eqiad.wmnet with reason: Maintenance
  • 21:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1238 (T354336)', diff saved to https://phabricator.wikimedia.org/P55572 and previous config saved to /var/cache/conftool/dbconfig/20240124-214120-marostegui.json
  • 21:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P55571 and previous config saved to /var/cache/conftool/dbconfig/20240124-212613-marostegui.json
  • 21:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P55570 and previous config saved to /var/cache/conftool/dbconfig/20240124-211107-marostegui.json
  • 21:05 aqu@deploy2002: Finished deploy [airflow-dags/analytics@5a0681b]: Regular analytics weekly train [airflow-dags/analytics@5a0681bc] (duration: 00m 37s)
  • 21:05 aqu@deploy2002: Started deploy [airflow-dags/analytics@5a0681b]: Regular analytics weekly train [airflow-dags/analytics@5a0681bc]
  • 20:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1238 (T354336)', diff saved to https://phabricator.wikimedia.org/P55569 and previous config saved to /var/cache/conftool/dbconfig/20240124-205600-marostegui.json
  • 20:53 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1238 (T354336)', diff saved to https://phabricator.wikimedia.org/P55568 and previous config saved to /var/cache/conftool/dbconfig/20240124-205350-marostegui.json
  • 20:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1238.eqiad.wmnet with reason: Maintenance
  • 20:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1238.eqiad.wmnet with reason: Maintenance
  • 20:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1221 (T354336)', diff saved to https://phabricator.wikimedia.org/P55567 and previous config saved to /var/cache/conftool/dbconfig/20240124-205327-marostegui.json
  • 20:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P55566 and previous config saved to /var/cache/conftool/dbconfig/20240124-203821-marostegui.json
  • 20:38 fab@deploy2002: Finished deploy [airflow-dags/research@2f514fc]: (no justification provided) (duration: 00m 33s)
  • 20:37 fab@deploy2002: Started deploy [airflow-dags/research@2f514fc]: (no justification provided)
  • 20:26 zabe: zabe@mwmaint2002:~$ mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=scowiki --logwiki=metawiki 'TheBabushka' 'AshotGPT' # T355743
  • 20:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P55565 and previous config saved to /var/cache/conftool/dbconfig/20240124-202315-marostegui.json
  • 20:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1221 (T354336)', diff saved to https://phabricator.wikimedia.org/P55564 and previous config saved to /var/cache/conftool/dbconfig/20240124-200808-marostegui.json
  • 20:07 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1221 (T354336)', diff saved to https://phabricator.wikimedia.org/P55563 and previous config saved to /var/cache/conftool/dbconfig/20240124-200659-marostegui.json
  • 20:06 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 20:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 20:06 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1221.eqiad.wmnet with reason: Maintenance
  • 20:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1221.eqiad.wmnet with reason: Maintenance
  • 20:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T354336)', diff saved to https://phabricator.wikimedia.org/P55562 and previous config saved to /var/cache/conftool/dbconfig/20240124-200619-marostegui.json
  • 20:02 cstone: payments-wiki upgraded from a3691a8e to 8cfbbb4b
  • 19:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P55561 and previous config saved to /var/cache/conftool/dbconfig/20240124-195113-marostegui.json
  • 19:39 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:38 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P55560 and previous config saved to /var/cache/conftool/dbconfig/20240124-193606-marostegui.json
  • 19:35 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:34 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:34 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:33 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:24 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:23 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T354336)', diff saved to https://phabricator.wikimedia.org/P55559 and previous config saved to /var/cache/conftool/dbconfig/20240124-192100-marostegui.json
  • 19:19 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1199 (T354336)', diff saved to https://phabricator.wikimedia.org/P55558 and previous config saved to /var/cache/conftool/dbconfig/20240124-191850-marostegui.json
  • 19:18 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1199.eqiad.wmnet with reason: Maintenance
  • 19:18 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1199.eqiad.wmnet with reason: Maintenance
  • 19:18 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T354336)', diff saved to https://phabricator.wikimedia.org/P55557 and previous config saved to /var/cache/conftool/dbconfig/20240124-191828-marostegui.json
  • 19:16 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching restbase[2022-2035].codfw.wmnet: Updated Cassandra to 4.1.1-wmf1 — T355719 - eevans@cumin1002
  • 19:13 eevans@cumin1002: END (FAIL) - Cookbook sre.cassandra.roll-restart (exit_code=99) for nodes matching restbase[2017-2035].codfw.wmnet: Updated Cassandra to 4.1.1-wmf1 — T355719 - eevans@cumin1002
  • 19:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P55555 and previous config saved to /var/cache/conftool/dbconfig/20240124-190322-marostegui.json
  • 18:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P55554 and previous config saved to /var/cache/conftool/dbconfig/20240124-184815-marostegui.json
  • 18:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T354336)', diff saved to https://phabricator.wikimedia.org/P55553 and previous config saved to /var/cache/conftool/dbconfig/20240124-183308-marostegui.json
  • 18:31 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1190 (T354336)', diff saved to https://phabricator.wikimedia.org/P55552 and previous config saved to /var/cache/conftool/dbconfig/20240124-183059-marostegui.json
  • 18:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1190.eqiad.wmnet with reason: Maintenance
  • 18:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1190.eqiad.wmnet with reason: Maintenance
  • 18:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 18:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 18:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1149.eqiad.wmnet with reason: Maintenance
  • 18:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1149.eqiad.wmnet with reason: Maintenance
  • 18:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T354336)', diff saved to https://phabricator.wikimedia.org/P55551 and previous config saved to /var/cache/conftool/dbconfig/20240124-183001-marostegui.json
  • 18:24 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching restbase[2017-2035].codfw.wmnet: Updated Cassandra to 4.1.1-wmf1 — T355719 - eevans@cumin1002
  • 18:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P55550 and previous config saved to /var/cache/conftool/dbconfig/20240124-181455-marostegui.json
  • 18:09 mfossati@deploy2002: Finished deploy [airflow-dags/platform_eng@fed6de3]: (no justification provided) (duration: 00m 32s)
  • 18:08 mfossati@deploy2002: Started deploy [airflow-dags/platform_eng@fed6de3]: (no justification provided)
  • 17:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P55549 and previous config saved to /var/cache/conftool/dbconfig/20240124-175948-marostegui.json
  • 17:50 klausman@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
  • 17:50 klausman@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
  • 17:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
  • 17:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
  • 17:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T354336)', diff saved to https://phabricator.wikimedia.org/P55548 and previous config saved to /var/cache/conftool/dbconfig/20240124-174442-marostegui.json
  • 17:43 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1146:3314 (T354336)', diff saved to https://phabricator.wikimedia.org/P55547 and previous config saved to /var/cache/conftool/dbconfig/20240124-174332-marostegui.json
  • 17:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1146.eqiad.wmnet with reason: Maintenance
  • 17:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1146.eqiad.wmnet with reason: Maintenance
  • 17:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 17:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 17:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T354336)', diff saved to https://phabricator.wikimedia.org/P55546 and previous config saved to /var/cache/conftool/dbconfig/20240124-174251-marostegui.json
  • 17:35 eevans@cumin1002: END (FAIL) - Cookbook sre.cassandra.roll-restart (exit_code=99) for nodes matching restbase[2015-2035].codfw.wmnet: Updated Cassandra to 4.1.1-wmf1 — T355719 - eevans@cumin1002
  • 17:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P55545 and previous config saved to /var/cache/conftool/dbconfig/20240124-172745-marostegui.json
  • 17:24 hashar@deploy2002: Synchronized php: group1 wikis to 1.42.0-wmf.15 refs T354433 (duration: 07m 10s)
  • 17:17 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching restbase[2015-2035].codfw.wmnet: Updated Cassandra to 4.1.1-wmf1 — T355719 - eevans@cumin1002
  • 17:16 hashar@deploy2002: rebuilt and synchronized wikiversions files: group1 wikis to 1.42.0-wmf.15 refs T354433
  • 17:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P55544 and previous config saved to /var/cache/conftool/dbconfig/20240124-171238-marostegui.json
  • 17:10 sukhe: sudo cumin -b1 -s60 "R:Class = Bird" "enable-puppet 'CR991699' && run-puppet-agent"
  • 17:09 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching restbase103[1-3].eqiad.wmnet: Updated Cassandra to 4.1.1-wmf1 — T355719 - eevans@cumin1002
  • 17:06 jnuche@deploy2002: Finished deploy [releng/jenkins-deploy@16476a9] (releasing): (no justification provided) (duration: 01m 07s)
  • 17:06 jnuche@deploy2002: Started deploy [releng/jenkins-deploy@16476a9] (releasing): (no justification provided)
  • 17:05 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2053.codfw.wmnet
  • 17:05 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1053.eqiad.wmnet
  • 16:59 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2053.codfw.wmnet
  • 16:59 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1053.eqiad.wmnet
  • 16:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T354336)', diff saved to https://phabricator.wikimedia.org/P55543 and previous config saved to /var/cache/conftool/dbconfig/20240124-165732-marostegui.json
  • 16:56 vgutierrez: enable puppet on cp3066 - T354424
  • 16:55 sukhe: enable puppet on durum1001 to test CR 991699
  • 16:55 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1144:3314 (T354336)', diff saved to https://phabricator.wikimedia.org/P55542 and previous config saved to /var/cache/conftool/dbconfig/20240124-165522-marostegui.json
  • 16:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1144.eqiad.wmnet with reason: Maintenance
  • 16:55 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1144.eqiad.wmnet with reason: Maintenance
  • 16:54 XioNoX: disable puppet on all the hosts running bird to deploy https://gerrit.wikimedia.org/r/c/operations/puppet/+/991699
  • 16:39 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching restbase103[1-3].eqiad.wmnet: Updated Cassandra to 4.1.1-wmf1 — T355719 - eevans@cumin1002
  • 16:35 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2103.codfw.wmnet with reason: Maintenance
  • 16:35 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2103.codfw.wmnet with reason: Maintenance
  • 16:30 eevans@cumin1002: END (FAIL) - Cookbook sre.cassandra.roll-restart (exit_code=99) for nodes matching A:restbase-eqiad: Updated Cassandra to 4.1.1-wmf1 — T355719 - eevans@cumin1002
  • 16:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2188 (T354336)', diff saved to https://phabricator.wikimedia.org/P55541 and previous config saved to /var/cache/conftool/dbconfig/20240124-162532-marostegui.json
  • 16:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P55540 and previous config saved to /var/cache/conftool/dbconfig/20240124-161026-marostegui.json
  • 16:04 klausman@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
  • 16:04 klausman@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
  • 16:03 klausman@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
  • 16:03 klausman@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
  • 15:58 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host phab2002.codfw.wmnet
  • 15:57 hashar@deploy2002: Synchronized php-1.42.0-wmf.15/extensions/Echo/includes/Formatters/EchoRevertedPresentationModel.php: Fix EchoRevertedPresentationModel using null as string - T355751 (duration: 09m 06s)
  • 15:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P55539 and previous config saved to /var/cache/conftool/dbconfig/20240124-155519-marostegui.json
  • 15:50 vgutierrez: disable puppet on cp3066 - T354424
  • 15:48 sukhe: sudo cumin -b1 -s120 'A:dns-rec' "enable-puppet 'merging CR 980929' && run-puppet-agent"
  • 15:47 hashar@deploy2002: Synchronized php-1.42.0-wmf.15/extensions/CentralAuth/tests/phpunit/CentralAuthIdLookupTest.php: Fix CentralIdLookup tests (duration: 11m 18s)
  • 15:45 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2446.codfw.wmnet with OS bullseye
  • 15:42 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2430.codfw.wmnet with OS bullseye
  • 15:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2188 (T354336)', diff saved to https://phabricator.wikimedia.org/P55538 and previous config saved to /var/cache/conftool/dbconfig/20240124-154013-marostegui.json
  • 15:39 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2427.codfw.wmnet with OS bullseye
  • 15:38 sukhe: sudo cumin 'A:dns-rec' "disable-puppet 'merging CR 980929'"
  • 15:38 klausman@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
  • 15:38 klausman@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
  • 15:38 klausman@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
  • 15:37 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2188 (T354336)', diff saved to https://phabricator.wikimedia.org/P55537 and previous config saved to /var/cache/conftool/dbconfig/20240124-153752-marostegui.json
  • 15:37 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2188.codfw.wmnet with reason: Maintenance
  • 15:37 klausman@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
  • 15:37 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2188.codfw.wmnet with reason: Maintenance
  • 15:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T354336)', diff saved to https://phabricator.wikimedia.org/P55536 and previous config saved to /var/cache/conftool/dbconfig/20240124-153730-marostegui.json
  • 15:37 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host phab2002.codfw.wmnet
  • 15:37 klausman@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
  • 15:36 klausman@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
  • 15:32 moritzm: imported jenkins 2.426.3 for buster/bullseye T355503
  • 15:25 aqu@deploy2002: Finished deploy [airflow-dags/analytics@da2e61c]: Regular analytics weekly train [airflow-dags/analytics@da2e61c7] (duration: 00m 42s)
  • 15:25 aqu@deploy2002: Started deploy [airflow-dags/analytics@da2e61c]: Regular analytics weekly train [airflow-dags/analytics@da2e61c7]
  • 15:25 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2446.codfw.wmnet with reason: host reimage
  • 15:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P55534 and previous config saved to /var/cache/conftool/dbconfig/20240124-152224-marostegui.json
  • 15:22 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2430.codfw.wmnet with reason: host reimage
  • 15:21 aqu: Refinery weekly deployment train - end (scap, then deployed onto hdfs) (test cluster deploy still broken T354703)
  • 15:19 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2427.codfw.wmnet with reason: host reimage
  • 15:17 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2430.codfw.wmnet with reason: host reimage
  • 15:16 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2446.codfw.wmnet with reason: host reimage
  • 15:16 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2427.codfw.wmnet with reason: host reimage
  • 15:12 aqu@deploy2002: Finished deploy [analytics/refinery@13f7a06] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@13f7a06c] (duration: 03m 28s)
  • 15:11 moritzm: uploading pymysql 1.0.2-2~wmf11u1 to apt.wikimedia.org T355531
  • 15:09 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2055.codfw.wmnet
  • 15:08 aqu@deploy2002: Started deploy [analytics/refinery@13f7a06] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@13f7a06c]
  • 15:08 aqu@deploy2002: Finished deploy [analytics/refinery@13f7a06] (thin): Regular analytics weekly train THIN [analytics/refinery@13f7a06c] (duration: 00m 05s)
  • 15:08 aqu@deploy2002: Started deploy [analytics/refinery@13f7a06] (thin): Regular analytics weekly train THIN [analytics/refinery@13f7a06c]
  • 15:07 aqu@deploy2002: Finished deploy [analytics/refinery@13f7a06]: Regular analytics weekly train [analytics/refinery@13f7a06c] (duration: 10m 12s)
  • 15:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P55533 and previous config saved to /var/cache/conftool/dbconfig/20240124-150718-marostegui.json
  • 15:04 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2055.codfw.wmnet
  • 14:59 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2446.codfw.wmnet with OS bullseye
  • 14:59 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2430.codfw.wmnet with OS bullseye
  • 14:59 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2427.codfw.wmnet with OS bullseye
  • 14:57 aqu@deploy2002: Started deploy [analytics/refinery@13f7a06]: Regular analytics weekly train [analytics/refinery@13f7a06c]
  • 14:57 akosiaris@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 14:57 akosiaris@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 14:56 akosiaris@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 14:56 akosiaris@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 14:56 akosiaris@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 14:56 aqu@deploy2002: Finished deploy [analytics/refinery@d1ee04c] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@d1ee04cc] (duration: 03m 40s)
  • 14:56 akosiaris@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 14:55 akosiaris: bump eventrouter limits/requests memory/cpu
  • 14:55 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 14:55 bking@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['elastic2094.codfw.wmnet']
  • 14:55 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 14:52 aqu@deploy2002: Started deploy [analytics/refinery@d1ee04c] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@d1ee04cc]
  • 14:52 aqu@deploy2002: Finished deploy [analytics/refinery@d1ee04c] (thin): Regular analytics weekly train THIN [analytics/refinery@d1ee04cc] (duration: 00m 06s)
  • 14:52 aqu@deploy2002: Started deploy [analytics/refinery@d1ee04c] (thin): Regular analytics weekly train THIN [analytics/refinery@d1ee04cc]
  • 14:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T354336)', diff saved to https://phabricator.wikimedia.org/P55532 and previous config saved to /var/cache/conftool/dbconfig/20240124-145211-marostegui.json
  • 14:51 Lucas_WMDE: UTC afternoon backport+config window done
  • 14:50 aqu@deploy2002: Finished deploy [analytics/refinery@d1ee04c]: Regular analytics weekly train [analytics/refinery@d1ee04cc] (duration: 09m 11s)
  • 14:50 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for cswiki: remove unused birthday logo files (duration: 09m 36s)
  • 14:50 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2176 (T354336)', diff saved to https://phabricator.wikimedia.org/P55531 and previous config saved to /var/cache/conftool/dbconfig/20240124-144947-marostegui.json
  • 14:49 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2176.codfw.wmnet with reason: Maintenance
  • 14:49 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2176.codfw.wmnet with reason: Maintenance
  • 14:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T354336)', diff saved to https://phabricator.wikimedia.org/P55530 and previous config saved to /var/cache/conftool/dbconfig/20240124-144925-marostegui.json
  • 14:47 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2054.codfw.wmnet
  • 14:44 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde: Continuing with sync
  • 14:43 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde: Backport for cswiki: remove unused birthday logo files synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:41 aqu@deploy2002: Started deploy [analytics/refinery@d1ee04c]: Regular analytics weekly train [analytics/refinery@d1ee04cc]
  • 14:41 aqu@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
  • 14:41 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for cswiki: remove unused birthday logo files
  • 14:40 aqu@deploy2002: helmfile [codfw] START helmfile.d/services/eventstreams: apply
  • 14:39 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for [azwiki] Add new namespace aliases (T355041) (duration: 10m 00s)
  • 14:38 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2054.codfw.wmnet
  • 14:37 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1054.eqiad.wmnet
  • 14:37 aqu@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
  • 14:36 aqu@deploy2002: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
  • 14:36 aqu@deploy2002: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
  • 14:35 aqu@deploy2002: helmfile [staging] START helmfile.d/services/eventstreams: apply
  • 14:35 aqu@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
  • 14:35 aqu@deploy2002: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
  • 14:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1038.eqiad.wmnet
  • 14:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P55529 and previous config saved to /var/cache/conftool/dbconfig/20240124-143419-marostegui.json
  • 14:34 aqu@deploy2002: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
  • 14:33 aqu@deploy2002: helmfile [staging] START helmfile.d/services/eventstreams: apply
  • 14:33 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1054.eqiad.wmnet
  • 14:32 logmsgbot: lucaswerkmeister-wmde@deploy2002 superpes and lucaswerkmeister-wmde: Continuing with sync
  • 14:31 aqu: analytics/refinery weekly deployment train - begin
  • 14:31 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2052.codfw.wmnet
  • 14:31 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1052.eqiad.wmnet
  • 14:30 logmsgbot: lucaswerkmeister-wmde@deploy2002 superpes and lucaswerkmeister-wmde: Backport for [azwiki] Add new namespace aliases (T355041) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:29 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-eqiad: Updated Cassandra to 4.1.1-wmf1 — T355719 - eevans@cumin1002
  • 14:29 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for [azwiki] Add new namespace aliases (T355041)
  • 14:27 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for [ganwiki] Change autoconfirmed setting (T355126) (duration: 09m 51s)
  • 14:26 bking@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['elastic2094.codfw.wmnet']
  • 14:25 bking@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['elastic2094.codfw.wmnet']
  • 14:25 bking@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['elastic2094.codfw.wmnet']
  • 14:25 bking@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['elastic2094.codfw.wmnet']
  • 14:25 bking@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['elastic2094.codfw.wmnet']
  • 14:25 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2052.codfw.wmnet
  • 14:25 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1052.eqiad.wmnet
  • 14:25 bking@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['elastic2088.codfw.wmnet']
  • 14:24 bking@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['elastic2088.codfw.wmnet']
  • 14:24 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1038.eqiad.wmnet
  • 14:20 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and superpes: Continuing with sync
  • 14:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P55527 and previous config saved to /var/cache/conftool/dbconfig/20240124-141912-marostegui.json
  • 14:19 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and superpes: Backport for [ganwiki] Change autoconfirmed setting (T355126) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:17 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for [ganwiki] Change autoconfirmed setting (T355126)
  • 14:14 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for Add mediawiki.reference_previews to wgEventLoggingStreamNames (T353798) (duration: 10m 52s)
  • 14:09 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2053.codfw.wmnet
  • 14:08 logmsgbot: lucaswerkmeister-wmde@deploy2002 wmde-fisch and lucaswerkmeister-wmde: Continuing with sync
  • 14:05 logmsgbot: lucaswerkmeister-wmde@deploy2002 wmde-fisch and lucaswerkmeister-wmde: Backport for Add mediawiki.reference_previews to wgEventLoggingStreamNames (T353798) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:04 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2053.codfw.wmnet
  • 14:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T354336)', diff saved to https://phabricator.wikimedia.org/P55526 and previous config saved to /var/cache/conftool/dbconfig/20240124-140406-marostegui.json
  • 14:04 klausman@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on ml-serve2005.codfw.wmnet with reason: Machine move (T355437)
  • 14:04 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for Add mediawiki.reference_previews to wgEventLoggingStreamNames (T353798)
  • 14:03 klausman@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on ml-serve2005.codfw.wmnet with reason: Machine move (T355437)
  • 14:01 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2174 (T354336)', diff saved to https://phabricator.wikimedia.org/P55525 and previous config saved to /var/cache/conftool/dbconfig/20240124-140142-marostegui.json
  • 14:01 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2174.codfw.wmnet with reason: Maintenance
  • 14:01 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2174.codfw.wmnet with reason: Maintenance
  • 14:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2173 (T354336)', diff saved to https://phabricator.wikimedia.org/P55524 and previous config saved to /var/cache/conftool/dbconfig/20240124-140120-marostegui.json
  • 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1053.eqiad.wmnet
  • 13:54 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 100%: After switchover', diff saved to https://phabricator.wikimedia.org/P55523 and previous config saved to /var/cache/conftool/dbconfig/20240124-135424-root.json
  • 13:50 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1053.eqiad.wmnet
  • 13:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P55522 and previous config saved to /var/cache/conftool/dbconfig/20240124-134614-marostegui.json
  • 13:39 samtar@deploy2002: Finished scap: Backport for Added Diff to approved list of RSS feeds for Foundation Governance Wiki and removed inoperative feed. (T354790) (duration: 09m 14s)
  • 13:39 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 75%: After switchover', diff saved to https://phabricator.wikimedia.org/P55521 and previous config saved to /var/cache/conftool/dbconfig/20240124-133919-root.json
  • 13:37 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2051.codfw.wmnet
  • 13:37 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1051.eqiad.wmnet
  • 13:32 samtar@deploy2002: samtar and varnent: Continuing with sync
  • 13:32 samtar@deploy2002: samtar and varnent: Backport for Added Diff to approved list of RSS feeds for Foundation Governance Wiki and removed inoperative feed. (T354790) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:31 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1051.eqiad.wmnet
  • 13:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P55520 and previous config saved to /var/cache/conftool/dbconfig/20240124-133107-marostegui.json
  • 13:31 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2051.codfw.wmnet
  • 13:30 samtar@deploy2002: Started scap: Backport for Added Diff to approved list of RSS feeds for Foundation Governance Wiki and removed inoperative feed. (T354790)
  • 13:24 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 50%: After switchover', diff saved to https://phabricator.wikimedia.org/P55519 and previous config saved to /var/cache/conftool/dbconfig/20240124-132414-root.json
  • 13:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2173 (T354336)', diff saved to https://phabricator.wikimedia.org/P55518 and previous config saved to /var/cache/conftool/dbconfig/20240124-131600-marostegui.json
  • 13:09 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 25%: After switchover', diff saved to https://phabricator.wikimedia.org/P55517 and previous config saved to /var/cache/conftool/dbconfig/20240124-130909-root.json
  • 12:54 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 10%: After switchover', diff saved to https://phabricator.wikimedia.org/P55516 and previous config saved to /var/cache/conftool/dbconfig/20240124-125404-root.json
  • 12:42 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2052.codfw.wmnet
  • 12:39 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 5%: After switchover', diff saved to https://phabricator.wikimedia.org/P55515 and previous config saved to /var/cache/conftool/dbconfig/20240124-123859-root.json
  • 12:34 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2052.codfw.wmnet
  • 12:33 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1052.eqiad.wmnet
  • 12:28 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1052.eqiad.wmnet
  • 12:23 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 1%: After switchover', diff saved to https://phabricator.wikimedia.org/P55514 and previous config saved to /var/cache/conftool/dbconfig/20240124-122354-root.json
  • 12:21 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1231 T355760', diff saved to https://phabricator.wikimedia.org/P55513 and previous config saved to /var/cache/conftool/dbconfig/20240124-122148-root.json
  • 12:20 marostegui@cumin1002: dbctl commit (dc=all): 'Promote db1173 to s6 primary T355760', diff saved to https://phabricator.wikimedia.org/P55512 and previous config saved to /var/cache/conftool/dbconfig/20240124-122030-marostegui.json
  • 12:19 marostegui: Starting s6 eqiad failover from db1231 to db1173 - T355760
  • 12:15 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 12:15 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 12:15 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2173.codfw.wmnet with reason: Maintenance
  • 12:14 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2173.codfw.wmnet with reason: Maintenance
  • 12:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311 (T354336)', diff saved to https://phabricator.wikimedia.org/P55510 and previous config saved to /var/cache/conftool/dbconfig/20240124-121448-marostegui.json
  • 12:07 ladsgroup@deploy2002: Finished scap: Backport for GenerateFancyCaptchas: Add ->disableSandbox() to shell command (duration: 09m 55s)
  • 12:02 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 28 hosts with reason: Primary switchover s6 T355760
  • 12:01 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 28 hosts with reason: Primary switchover s6 T355760
  • 12:00 ladsgroup@deploy2002: ladsgroup: Continuing with sync
  • 11:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311', diff saved to https://phabricator.wikimedia.org/P55509 and previous config saved to /var/cache/conftool/dbconfig/20240124-115942-marostegui.json
  • 11:58 ladsgroup@deploy2002: ladsgroup: Backport for GenerateFancyCaptchas: Add ->disableSandbox() to shell command synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 11:58 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host acmechief-test1001.eqiad.wmnet
  • 11:57 ladsgroup@deploy2002: Started scap: Backport for GenerateFancyCaptchas: Add ->disableSandbox() to shell command
  • 11:57 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 11:56 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 11:56 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2050.codfw.wmnet
  • 11:55 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 11:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host acmechief-test2001.codfw.wmnet
  • 11:55 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1050.eqiad.wmnet
  • 11:54 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 11:52 hnowlan@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 11:52 hnowlan@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 11:49 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2050.codfw.wmnet
  • 11:49 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1050.eqiad.wmnet
  • 11:47 hnowlan@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 11:46 hnowlan@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 11:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311', diff saved to https://phabricator.wikimedia.org/P55506 and previous config saved to /var/cache/conftool/dbconfig/20240124-114435-marostegui.json
  • 11:43 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host acmechief-test2001.codfw.wmnet
  • 11:33 vgutierrez: repool cp3066 - T354424
  • 11:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host snapshot1014.eqiad.wmnet with OS bullseye
  • 11:32 kharlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
  • 11:32 kharlan@deploy2002: helmfile [codfw] START helmfile.d/services/ipoid: apply
  • 11:31 vgutierrez: depooling cp3066 - T354424
  • 11:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311 (T354336)', diff saved to https://phabricator.wikimedia.org/P55505 and previous config saved to /var/cache/conftool/dbconfig/20240124-112929-marostegui.json
  • 11:27 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2170:3311 (T354336)', diff saved to https://phabricator.wikimedia.org/P55504 and previous config saved to /var/cache/conftool/dbconfig/20240124-112705-marostegui.json
  • 11:26 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2170.codfw.wmnet with reason: Maintenance
  • 11:26 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2170.codfw.wmnet with reason: Maintenance
  • 11:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311 (T354336)', diff saved to https://phabricator.wikimedia.org/P55503 and previous config saved to /var/cache/conftool/dbconfig/20240124-112643-marostegui.json
  • 11:26 kharlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
  • 11:26 kharlan@deploy2002: helmfile [eqiad] START helmfile.d/services/ipoid: apply
  • 11:24 kharlan@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
  • 11:24 kharlan@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
  • 11:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311', diff saved to https://phabricator.wikimedia.org/P55501 and previous config saved to /var/cache/conftool/dbconfig/20240124-111136-marostegui.json
  • 11:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on snapshot1014.eqiad.wmnet with reason: host reimage
  • 10:59 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on snapshot1014.eqiad.wmnet with reason: host reimage
  • 10:57 zabe: zabe@mwmaint2002:~$ mwscript namespaceDupes.php --wiki=rowikinews --fix # T350889
  • 10:57 marostegui@cumin1002: dbctl commit (dc=all): 'Set db1173 with weight 0 T355760', diff saved to https://phabricator.wikimedia.org/P55500 and previous config saved to /var/cache/conftool/dbconfig/20240124-105702-root.json
  • 10:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311', diff saved to https://phabricator.wikimedia.org/P55499 and previous config saved to /var/cache/conftool/dbconfig/20240124-105630-marostegui.json
  • 10:45 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host snapshot1014.eqiad.wmnet with OS bullseye
  • 10:44 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host snapshot1014.eqiad.wmnet
  • 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host snapshot1017.eqiad.wmnet with OS bullseye
  • 10:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311 (T354336)', diff saved to https://phabricator.wikimedia.org/P55498 and previous config saved to /var/cache/conftool/dbconfig/20240124-104123-marostegui.json
  • 10:39 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2167:3311 (T354336)', diff saved to https://phabricator.wikimedia.org/P55497 and previous config saved to /var/cache/conftool/dbconfig/20240124-103900-marostegui.json
  • 10:38 hashar: deployment-server: removing `gerrit` remote from `/srv/mediawiki-staging` given it is tied to a specific username and the `origin` remote already has ssh protocol for push # ping James_F
  • 10:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2167.codfw.wmnet with reason: Maintenance
  • 10:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2167.codfw.wmnet with reason: Maintenance
  • 10:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T354336)', diff saved to https://phabricator.wikimedia.org/P55496 and previous config saved to /var/cache/conftool/dbconfig/20240124-103837-marostegui.json
  • 10:37 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host snapshot1014.eqiad.wmnet
  • 10:36 moritzm: upgrading cumin1002 to pymysql 1.0.2-2~wmf11u1 T355531
  • 10:31 hashar@deploy2002: rebuilt and synchronized wikiversions files: Revert "group1 wikis to 1.42.0-wmf.15" - T354433
  • 10:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P55495 and previous config saved to /var/cache/conftool/dbconfig/20240124-102330-marostegui.json
  • 10:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on snapshot1017.eqiad.wmnet with reason: host reimage
  • 10:10 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on snapshot1017.eqiad.wmnet with reason: host reimage
  • 10:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P55494 and previous config saved to /var/cache/conftool/dbconfig/20240124-100824-marostegui.json
  • 10:00 vgutierrez: repool cp3066 - T354424
  • 09:58 vgutierrez: depooling cp3066 - T354424
  • 09:53 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host snapshot1017.eqiad.wmnet with OS bullseye
  • 09:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T354336)', diff saved to https://phabricator.wikimedia.org/P55493 and previous config saved to /var/cache/conftool/dbconfig/20240124-095317-marostegui.json
  • 09:50 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2153 (T354336)', diff saved to https://phabricator.wikimedia.org/P55492 and previous config saved to /var/cache/conftool/dbconfig/20240124-095054-marostegui.json
  • 09:50 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2153.codfw.wmnet with reason: Maintenance
  • 09:50 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2153.codfw.wmnet with reason: Maintenance
  • 09:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T354336)', diff saved to https://phabricator.wikimedia.org/P55491 and previous config saved to /var/cache/conftool/dbconfig/20240124-095032-marostegui.json
  • 09:49 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2187.codfw.wmnet with reason: A1 codfw maintenance
  • 09:49 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2187.codfw.wmnet with reason: A1 codfw maintenance
  • 09:49 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1037.eqiad.wmnet to cluster eqiad and group C
  • 09:41 ayounsi@cumin2002: END (FAIL) - Cookbook sre.network.tls (exit_code=99) for network device lsw1-f8-eqiad
  • 09:41 ayounsi@cumin2002: START - Cookbook sre.network.tls for network device lsw1-f8-eqiad
  • 09:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P55489 and previous config saved to /var/cache/conftool/dbconfig/20240124-093526-marostegui.json
  • 09:32 ayounsi@cumin1002: END (FAIL) - Cookbook sre.network.tls (exit_code=99) for network device lsw1-f8-eqiad
  • 09:32 ayounsi@cumin1002: START - Cookbook sre.network.tls for network device lsw1-f8-eqiad
  • 09:31 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1037.eqiad.wmnet to cluster eqiad and group C
  • 09:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2136.codfw.wmnet with reason: A1 codfw maintenance T355437
  • 09:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2136.codfw.wmnet with reason: A1 codfw maintenance T355437
  • 09:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2157.codfw.wmnet with reason: A1 codfw maintenance T355437
  • 09:29 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2157.codfw.wmnet with reason: A1 codfw maintenance T355437
  • 09:29 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2158.codfw.wmnet with reason: A1 codfw maintenance T355437
  • 09:29 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2158.codfw.wmnet with reason: A1 codfw maintenance T355437
  • 09:28 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2026.codfw.wmnet with reason: A1 codfw maintenance T355437
  • 09:28 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on es2026.codfw.wmnet with reason: A1 codfw maintenance T355437
  • 09:27 hashar@deploy2002: Synchronized php: group1 wikis to 1.42.0-wmf.15 refs T354433 (duration: 06m 55s)
  • 09:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P55488 and previous config saved to /var/cache/conftool/dbconfig/20240124-092019-marostegui.json
  • 09:20 hashar@deploy2002: rebuilt and synchronized wikiversions files: group1 wikis to 1.42.0-wmf.15 refs T354433
  • 09:08 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ganeti1037.eqiad.wmnet
  • 09:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T354336)', diff saved to https://phabricator.wikimedia.org/P55487 and previous config saved to /var/cache/conftool/dbconfig/20240124-090512-marostegui.json
  • 09:02 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2146 (T354336)', diff saved to https://phabricator.wikimedia.org/P55486 and previous config saved to /var/cache/conftool/dbconfig/20240124-090250-marostegui.json
  • 09:02 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2146.codfw.wmnet with reason: Maintenance
  • 09:02 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2146.codfw.wmnet with reason: Maintenance
  • 09:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T354336)', diff saved to https://phabricator.wikimedia.org/P55485 and previous config saved to /var/cache/conftool/dbconfig/20240124-090228-marostegui.json
  • 08:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1037.eqiad.wmnet
  • 08:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P55484 and previous config saved to /var/cache/conftool/dbconfig/20240124-084721-marostegui.json
  • 08:45 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host ganeti1037.eqiad.wmnet
  • 08:36 hashar@deploy2002: Finished scap: Backport for Use a class for 'LogActionsHandlers' (T355680) (duration: 08m 00s)
  • 08:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P55483 and previous config saved to /var/cache/conftool/dbconfig/20240124-083215-marostegui.json
  • 08:30 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1037.eqiad.wmnet
  • 08:30 hashar@deploy2002: hashar: Continuing with sync
  • 08:30 hashar@deploy2002: hashar: Backport for Use a class for 'LogActionsHandlers' (T355680) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:28 hashar@deploy2002: Started scap: Backport for Use a class for 'LogActionsHandlers' (T355680)
  • 08:25 logmsgbot: wmde-fisch@deploy2002 Finished scap: Backport for Allow Cite events for reference previews baseline stats (T353798) (duration: 08m 32s)
  • 08:18 logmsgbot: wmde-fisch@deploy2002 wmde-fisch: Continuing with sync
  • 08:18 logmsgbot: wmde-fisch@deploy2002 wmde-fisch: Backport for Allow Cite events for reference previews baseline stats (T353798) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:17 logmsgbot: wmde-fisch@deploy2002 Started scap: Backport for Allow Cite events for reference previews baseline stats (T353798)
  • 08:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T354336)', diff saved to https://phabricator.wikimedia.org/P55482 and previous config saved to /var/cache/conftool/dbconfig/20240124-081708-marostegui.json
  • 08:14 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2145 (T354336)', diff saved to https://phabricator.wikimedia.org/P55481 and previous config saved to /var/cache/conftool/dbconfig/20240124-081445-marostegui.json
  • 08:14 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2145.codfw.wmnet with reason: Maintenance
  • 08:14 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2145.codfw.wmnet with reason: Maintenance
  • 08:13 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2141.codfw.wmnet with reason: Maintenance
  • 08:13 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2141.codfw.wmnet with reason: Maintenance
  • 08:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2130 (T354336)', diff saved to https://phabricator.wikimedia.org/P55480 and previous config saved to /var/cache/conftool/dbconfig/20240124-081340-marostegui.json
  • 08:10 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55479 and previous config saved to /var/cache/conftool/dbconfig/20240124-081050-root.json
  • 08:07 logmsgbot: wmde-fisch@deploy2002 wmde-fisch: Backport for Allow Cite events for reference previews baseline stats (T353798) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:05 logmsgbot: wmde-fisch@deploy2002 Started scap: Backport for Allow Cite events for reference previews baseline stats (T353798)
  • 07:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P55478 and previous config saved to /var/cache/conftool/dbconfig/20240124-075834-marostegui.json
  • 07:55 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55477 and previous config saved to /var/cache/conftool/dbconfig/20240124-075545-root.json
  • 07:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P55476 and previous config saved to /var/cache/conftool/dbconfig/20240124-074327-marostegui.json
  • 07:40 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55475 and previous config saved to /var/cache/conftool/dbconfig/20240124-074040-root.json
  • 07:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2130 (T354336)', diff saved to https://phabricator.wikimedia.org/P55474 and previous config saved to /var/cache/conftool/dbconfig/20240124-072821-marostegui.json
  • 07:25 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2130 (T354336)', diff saved to https://phabricator.wikimedia.org/P55473 and previous config saved to /var/cache/conftool/dbconfig/20240124-072557-marostegui.json
  • 07:25 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2130.codfw.wmnet with reason: Maintenance
  • 07:25 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55472 and previous config saved to /var/cache/conftool/dbconfig/20240124-072535-root.json
  • 07:25 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2130.codfw.wmnet with reason: Maintenance
  • 07:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2116 (T354336)', diff saved to https://phabricator.wikimedia.org/P55471 and previous config saved to /var/cache/conftool/dbconfig/20240124-072523-marostegui.json
  • 07:19 marostegui@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 100%: Repool db2175 after a crash T355489', diff saved to https://phabricator.wikimedia.org/P55470 and previous config saved to /var/cache/conftool/dbconfig/20240124-071954-root.json
  • 07:10 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55469 and previous config saved to /var/cache/conftool/dbconfig/20240124-071030-root.json
  • 07:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P55468 and previous config saved to /var/cache/conftool/dbconfig/20240124-071016-marostegui.json
  • 07:04 marostegui@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 75%: Repool db2175 after a crash T355489', diff saved to https://phabricator.wikimedia.org/P55467 and previous config saved to /var/cache/conftool/dbconfig/20240124-070449-root.json
  • 06:55 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55466 and previous config saved to /var/cache/conftool/dbconfig/20240124-065525-root.json
  • 06:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P55465 and previous config saved to /var/cache/conftool/dbconfig/20240124-065510-marostegui.json
  • 06:49 marostegui@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 50%: Repool db2175 after a crash T355489', diff saved to https://phabricator.wikimedia.org/P55464 and previous config saved to /var/cache/conftool/dbconfig/20240124-064944-root.json
  • 06:40 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2129.codfw.wmnet with OS bookworm
  • 06:40 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55463 and previous config saved to /var/cache/conftool/dbconfig/20240124-064020-root.json
  • 06:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2116 (T354336)', diff saved to https://phabricator.wikimedia.org/P55462 and previous config saved to /var/cache/conftool/dbconfig/20240124-064003-marostegui.json
  • 06:37 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2116 (T354336)', diff saved to https://phabricator.wikimedia.org/P55461 and previous config saved to /var/cache/conftool/dbconfig/20240124-063739-marostegui.json
  • 06:37 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2116.codfw.wmnet with reason: Maintenance
  • 06:37 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2116.codfw.wmnet with reason: Maintenance
  • 06:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2112 (T354336)', diff saved to https://phabricator.wikimedia.org/P55460 and previous config saved to /var/cache/conftool/dbconfig/20240124-063717-marostegui.json
  • 06:34 marostegui@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 25%: Repool db2175 after a crash T355489', diff saved to https://phabricator.wikimedia.org/P55459 and previous config saved to /var/cache/conftool/dbconfig/20240124-063440-root.json
  • 06:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2112', diff saved to https://phabricator.wikimedia.org/P55458 and previous config saved to /var/cache/conftool/dbconfig/20240124-062210-marostegui.json
  • 06:19 marostegui@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 10%: Repool db2175 after a crash T355489', diff saved to https://phabricator.wikimedia.org/P55457 and previous config saved to /var/cache/conftool/dbconfig/20240124-061934-root.json
  • 06:18 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2129.codfw.wmnet with reason: host reimage
  • 06:15 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2129.codfw.wmnet with reason: host reimage
  • 06:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2112', diff saved to https://phabricator.wikimedia.org/P55456 and previous config saved to /var/cache/conftool/dbconfig/20240124-060703-marostegui.json
  • 06:04 marostegui@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 5%: Repool db2175 after a crash T355489', diff saved to https://phabricator.wikimedia.org/P55455 and previous config saved to /var/cache/conftool/dbconfig/20240124-060429-root.json
  • 05:58 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2129.codfw.wmnet with OS bookworm
  • 05:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2129 T354506', diff saved to https://phabricator.wikimedia.org/P55454 and previous config saved to /var/cache/conftool/dbconfig/20240124-055635-marostegui.json
  • 05:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2112 (T354336)', diff saved to https://phabricator.wikimedia.org/P55453 and previous config saved to /var/cache/conftool/dbconfig/20240124-055157-marostegui.json
  • 05:51 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2158 db2157 es2026 db2136 T355437', diff saved to https://phabricator.wikimedia.org/P55452 and previous config saved to /var/cache/conftool/dbconfig/20240124-055143-marostegui.json
  • 05:49 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2112 (T354336)', diff saved to https://phabricator.wikimedia.org/P55451 and previous config saved to /var/cache/conftool/dbconfig/20240124-054932-marostegui.json
  • 05:49 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2112.codfw.wmnet with reason: Maintenance
  • 05:49 marostegui@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 1%: Repool db2175 after a crash T355489', diff saved to https://phabricator.wikimedia.org/P55450 and previous config saved to /var/cache/conftool/dbconfig/20240124-054924-root.json
  • 05:49 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2112.codfw.wmnet with reason: Maintenance
  • 05:48 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2102.codfw.wmnet with reason: Maintenance
  • 05:48 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2102.codfw.wmnet with reason: Maintenance
  • 05:45 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1184.eqiad.wmnet with reason: Maintenance
  • 05:45 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1184.eqiad.wmnet with reason: Maintenance
  • 02:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 02:32 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 02:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234 (T354336)', diff saved to https://phabricator.wikimedia.org/P55449 and previous config saved to /var/cache/conftool/dbconfig/20240124-023210-marostegui.json
  • 02:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P55448 and previous config saved to /var/cache/conftool/dbconfig/20240124-021704-marostegui.json
  • 02:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P55447 and previous config saved to /var/cache/conftool/dbconfig/20240124-020157-marostegui.json
  • 01:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234 (T354336)', diff saved to https://phabricator.wikimedia.org/P55445 and previous config saved to /var/cache/conftool/dbconfig/20240124-014651-marostegui.json
  • 01:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1234 (T354336)', diff saved to https://phabricator.wikimedia.org/P55444 and previous config saved to /var/cache/conftool/dbconfig/20240124-014430-marostegui.json
  • 01:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1234.eqiad.wmnet with reason: Maintenance
  • 01:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1234.eqiad.wmnet with reason: Maintenance
  • 01:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232 (T354336)', diff saved to https://phabricator.wikimedia.org/P55443 and previous config saved to /var/cache/conftool/dbconfig/20240124-014408-marostegui.json
  • 01:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P55442 and previous config saved to /var/cache/conftool/dbconfig/20240124-012902-marostegui.json
  • 01:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P55441 and previous config saved to /var/cache/conftool/dbconfig/20240124-011355-marostegui.json
  • 00:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232 (T354336)', diff saved to https://phabricator.wikimedia.org/P55440 and previous config saved to /var/cache/conftool/dbconfig/20240124-005849-marostegui.json
  • 00:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1232 (T354336)', diff saved to https://phabricator.wikimedia.org/P55439 and previous config saved to /var/cache/conftool/dbconfig/20240124-005627-marostegui.json
  • 00:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1232.eqiad.wmnet with reason: Maintenance
  • 00:56 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1232.eqiad.wmnet with reason: Maintenance
  • 00:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1228 (T354336)', diff saved to https://phabricator.wikimedia.org/P55438 and previous config saved to /var/cache/conftool/dbconfig/20240124-005605-marostegui.json
  • 00:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1228', diff saved to https://phabricator.wikimedia.org/P55437 and previous config saved to /var/cache/conftool/dbconfig/20240124-004058-marostegui.json
  • 00:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1228', diff saved to https://phabricator.wikimedia.org/P55436 and previous config saved to /var/cache/conftool/dbconfig/20240124-002551-marostegui.json
  • 00:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1228 (T354336)', diff saved to https://phabricator.wikimedia.org/P55435 and previous config saved to /var/cache/conftool/dbconfig/20240124-001044-marostegui.json
  • 00:08 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1228 (T354336)', diff saved to https://phabricator.wikimedia.org/P55434 and previous config saved to /var/cache/conftool/dbconfig/20240124-000824-marostegui.json
  • 00:08 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1228.eqiad.wmnet with reason: Maintenance
  • 00:08 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1228.eqiad.wmnet with reason: Maintenance
  • 00:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219 (T354336)', diff saved to https://phabricator.wikimedia.org/P55433 and previous config saved to /var/cache/conftool/dbconfig/20240124-000802-marostegui.json

2024-01-23

  • 23:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P55432 and previous config saved to /var/cache/conftool/dbconfig/20240123-235255-marostegui.json
  • 23:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P55430 and previous config saved to /var/cache/conftool/dbconfig/20240123-233749-marostegui.json
  • 23:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219 (T354336)', diff saved to https://phabricator.wikimedia.org/P55429 and previous config saved to /var/cache/conftool/dbconfig/20240123-232242-marostegui.json
  • 23:20 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1219 (T354336)', diff saved to https://phabricator.wikimedia.org/P55428 and previous config saved to /var/cache/conftool/dbconfig/20240123-232021-marostegui.json
  • 23:20 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1219.eqiad.wmnet with reason: Maintenance
  • 23:20 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1219.eqiad.wmnet with reason: Maintenance
  • 23:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T354336)', diff saved to https://phabricator.wikimedia.org/P55427 and previous config saved to /var/cache/conftool/dbconfig/20240123-231959-marostegui.json
  • 23:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P55426 and previous config saved to /var/cache/conftool/dbconfig/20240123-230453-marostegui.json
  • 22:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P55425 and previous config saved to /var/cache/conftool/dbconfig/20240123-224946-marostegui.json
  • 22:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T354336)', diff saved to https://phabricator.wikimedia.org/P55424 and previous config saved to /var/cache/conftool/dbconfig/20240123-223439-marostegui.json
  • 22:32 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1218 (T354336)', diff saved to https://phabricator.wikimedia.org/P55423 and previous config saved to /var/cache/conftool/dbconfig/20240123-223215-marostegui.json
  • 22:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1218.eqiad.wmnet with reason: Maintenance
  • 22:32 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1218.eqiad.wmnet with reason: Maintenance
  • 22:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T354336)', diff saved to https://phabricator.wikimedia.org/P55422 and previous config saved to /var/cache/conftool/dbconfig/20240123-223153-marostegui.json
  • 22:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P55421 and previous config saved to /var/cache/conftool/dbconfig/20240123-221646-marostegui.json
  • 22:03 kostajh: UTC late deploys done
  • 22:02 kostajh: T355695 running mwscript resetAuthenticationThrottle.php --wiki=enwikibooks --signup --ip 195.70.81.86
  • 22:02 kostajh: T355695 running mwscript resetAuthenticationThrottle.php --wiki=enwikibooks --signup --ip 62.232.9.14
  • 22:01 kostajh: T355695 running mwscript resetAuthenticationThrottle.php --wiki=enwiki --signup --ip 195.70.81.86
  • 22:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P55420 and previous config saved to /var/cache/conftool/dbconfig/20240123-220140-marostegui.json
  • 22:01 kostajh: T355695 running mwscript resetAuthenticationThrottle.php --wiki=enwiki --signup --ip 62.232.9.14
  • 21:59 kharlan@deploy2002: Finished scap: Backport for [knwiki] Removing the temporary logo (already reverted) (T338136), [itwiki] Add the 'abusefilter-bypass-blocked-external-domains' right to botadmins (T355694), [enwiki] and [enwikibooks] Throttle exemption for event (T355695) (duration: 15m 33s)
  • 21:53 kharlan@deploy2002: superpes and kharlan: Continuing with sync
  • 21:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T354336)', diff saved to https://phabricator.wikimedia.org/P55419 and previous config saved to /var/cache/conftool/dbconfig/20240123-214633-marostegui.json
  • 21:45 kharlan@deploy2002: superpes and kharlan: Backport for [knwiki] Removing the temporary logo (already reverted) (T338136), [itwiki] Add the 'abusefilter-bypass-blocked-external-domains' right to botadmins (T355694), [enwiki] and [enwikibooks] Throttle exemption for event (T355695) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1207 (T354336)', diff saved to https://phabricator.wikimedia.org/P55418 and previous config saved to /var/cache/conftool/dbconfig/20240123-214413-marostegui.json
  • 21:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1207.eqiad.wmnet with reason: Maintenance
  • 21:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1207.eqiad.wmnet with reason: Maintenance
  • 21:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206 (T354336)', diff saved to https://phabricator.wikimedia.org/P55417 and previous config saved to /var/cache/conftool/dbconfig/20240123-214351-marostegui.json
  • 21:43 kharlan@deploy2002: Started scap: Backport for [knwiki] Removing the temporary logo (already reverted) (T338136), [itwiki] Add the 'abusefilter-bypass-blocked-external-domains' right to botadmins (T355694), [enwiki] and [enwikibooks] Throttle exemption for event (T355695)
  • 21:36 kharlan@deploy2002: Finished scap: Backport for revertrisk: Fix i18n message reference (T348298), revertrisk: Fix i18n messages (T348298) (duration: 30m 51s)
  • 21:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P55416 and previous config saved to /var/cache/conftool/dbconfig/20240123-212845-marostegui.json
  • 21:26 kharlan@deploy2002: kharlan: Continuing with sync
  • 21:26 kharlan@deploy2002: kharlan: Backport for revertrisk: Fix i18n message reference (T348298), revertrisk: Fix i18n messages (T348298) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P55415 and previous config saved to /var/cache/conftool/dbconfig/20240123-211338-marostegui.json
  • 21:05 kharlan@deploy2002: Started scap: Backport for revertrisk: Fix i18n message reference (T348298), revertrisk: Fix i18n messages (T348298)
  • 20:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206 (T354336)', diff saved to https://phabricator.wikimedia.org/P55414 and previous config saved to /var/cache/conftool/dbconfig/20240123-205832-marostegui.json
  • 20:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1206 (T354336)', diff saved to https://phabricator.wikimedia.org/P55413 and previous config saved to /var/cache/conftool/dbconfig/20240123-205611-marostegui.json
  • 20:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1206.eqiad.wmnet with reason: Maintenance
  • 20:55 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1206.eqiad.wmnet with reason: Maintenance
  • 20:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T354336)', diff saved to https://phabricator.wikimedia.org/P55412 and previous config saved to /var/cache/conftool/dbconfig/20240123-205549-marostegui.json
  • 20:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P55411 and previous config saved to /var/cache/conftool/dbconfig/20240123-204043-marostegui.json
  • 20:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P55410 and previous config saved to /var/cache/conftool/dbconfig/20240123-202536-marostegui.json
  • 20:23 cstone: payments-wiki upgraded from c2138768 to a3691a8e
  • 20:23 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T347624, test data xfer) xfer categories from wdqs2024.codfw.wmnet -> wdqs2025.codfw.wmnet w/ force delete existing files, repooling both afterwards
  • 20:12 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T347624, test data xfer) xfer categories from wdqs2024.codfw.wmnet -> wdqs2025.codfw.wmnet w/ force delete existing files, repooling both afterwards
  • 20:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T354336)', diff saved to https://phabricator.wikimedia.org/P55409 and previous config saved to /var/cache/conftool/dbconfig/20240123-201030-marostegui.json
  • 20:08 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T347624, test data xfer) xfer categories from wdqs2024.codfw.wmnet -> wdqs2025.codfw.wmnet w/ force delete existing files, repooling both afterwards
  • 20:08 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1196 (T354336)', diff saved to https://phabricator.wikimedia.org/P55408 and previous config saved to /var/cache/conftool/dbconfig/20240123-200809-marostegui.json
  • 20:08 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 20:07 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 20:07 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1196.eqiad.wmnet with reason: Maintenance
  • 20:07 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1196.eqiad.wmnet with reason: Maintenance
  • 20:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T354336)', diff saved to https://phabricator.wikimedia.org/P55407 and previous config saved to /var/cache/conftool/dbconfig/20240123-200726-marostegui.json
  • 19:57 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T347624, test data xfer) xfer categories from wdqs2024.codfw.wmnet -> wdqs2025.codfw.wmnet w/ force delete existing files, repooling both afterwards
  • 19:57 bking@cumin2002: END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97) (T347624, test data xfer) xfer categories from wdqs2024.codfw.wmnet -> wdqs2025.codfw.wmnet, repooling both afterwards
  • 19:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P55406 and previous config saved to /var/cache/conftool/dbconfig/20240123-195220-marostegui.json
  • 19:49 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T347624, test data xfer) xfer categories from wdqs2024.codfw.wmnet -> wdqs2025.codfw.wmnet, repooling both afterwards
  • 19:45 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs[2024-2025].codfw.wmnet with reason: testing data xfter cookbook
  • 19:45 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs[2024-2025].codfw.wmnet with reason: testing data xfter cookbook
  • 19:45 mutante: phab1004 - /srv/phab/phabricator/bin/mail volume
  • 19:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P55405 and previous config saved to /var/cache/conftool/dbconfig/20240123-193713-marostegui.json
  • 19:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T354336)', diff saved to https://phabricator.wikimedia.org/P55404 and previous config saved to /var/cache/conftool/dbconfig/20240123-192207-marostegui.json
  • 19:21 ejegg: fundraising civicrm upgraded from d8b0c977 to b85b6dde
  • 19:19 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1186 (T354336)', diff saved to https://phabricator.wikimedia.org/P55403 and previous config saved to /var/cache/conftool/dbconfig/20240123-191945-marostegui.json
  • 19:19 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1186.eqiad.wmnet with reason: Maintenance
  • 19:19 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1186.eqiad.wmnet with reason: Maintenance
  • 19:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T354336)', diff saved to https://phabricator.wikimedia.org/P55402 and previous config saved to /var/cache/conftool/dbconfig/20240123-191922-marostegui.json
  • 19:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P55401 and previous config saved to /var/cache/conftool/dbconfig/20240123-190416-marostegui.json
  • 18:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P55400 and previous config saved to /var/cache/conftool/dbconfig/20240123-184909-marostegui.json
  • 18:43 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['sretest1003.eqiad.wmnet']
  • 18:37 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['sretest1003.eqiad.wmnet']
  • 18:37 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['sretest1003.eqiad.wmnet']
  • 18:36 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['sretest1003.eqiad.wmnet']
  • 18:35 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['sretest1003.eqiad.wmnet']
  • 18:35 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['sretest1003.eqiad.wmnet']
  • 18:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T354336)', diff saved to https://phabricator.wikimedia.org/P55399 and previous config saved to /var/cache/conftool/dbconfig/20240123-183403-marostegui.json
  • 18:31 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1169 (T354336)', diff saved to https://phabricator.wikimedia.org/P55398 and previous config saved to /var/cache/conftool/dbconfig/20240123-183141-marostegui.json
  • 18:31 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 18:31 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 18:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T354336)', diff saved to https://phabricator.wikimedia.org/P55397 and previous config saved to /var/cache/conftool/dbconfig/20240123-183120-marostegui.json
  • 18:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P55396 and previous config saved to /var/cache/conftool/dbconfig/20240123-181613-marostegui.json
  • 18:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P55395 and previous config saved to /var/cache/conftool/dbconfig/20240123-180107-marostegui.json
  • 17:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T354336)', diff saved to https://phabricator.wikimedia.org/P55394 and previous config saved to /var/cache/conftool/dbconfig/20240123-174600-marostegui.json
  • 17:43 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1163 (T354336)', diff saved to https://phabricator.wikimedia.org/P55393 and previous config saved to /var/cache/conftool/dbconfig/20240123-174339-marostegui.json
  • 17:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1163.eqiad.wmnet with reason: Maintenance
  • 17:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1163.eqiad.wmnet with reason: Maintenance
  • 17:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1140.eqiad.wmnet with reason: Maintenance
  • 17:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1140.eqiad.wmnet with reason: Maintenance
  • 17:42 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1139.eqiad.wmnet with reason: Maintenance
  • 17:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1139.eqiad.wmnet with reason: Maintenance
  • 17:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T354336)', diff saved to https://phabricator.wikimedia.org/P55392 and previous config saved to /var/cache/conftool/dbconfig/20240123-174215-marostegui.json
  • 17:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P55391 and previous config saved to /var/cache/conftool/dbconfig/20240123-172709-marostegui.json
  • 17:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P55390 and previous config saved to /var/cache/conftool/dbconfig/20240123-171202-marostegui.json
  • 16:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T354336)', diff saved to https://phabricator.wikimedia.org/P55389 and previous config saved to /var/cache/conftool/dbconfig/20240123-165656-marostegui.json
  • 16:54 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1135 (T354336)', diff saved to https://phabricator.wikimedia.org/P55388 and previous config saved to /var/cache/conftool/dbconfig/20240123-165433-marostegui.json
  • 16:54 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1135.eqiad.wmnet with reason: Maintenance
  • 16:54 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1135.eqiad.wmnet with reason: Maintenance
  • 16:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1106.eqiad.wmnet with reason: Maintenance
  • 16:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1106.eqiad.wmnet with reason: Maintenance
  • 16:49 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest1003.eqiad.wmnet with OS bookworm
  • 16:39 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host sretest1003.eqiad.wmnet with OS bookworm
  • 16:14 ayounsi@cumin1002: END (FAIL) - Cookbook sre.network.tls (exit_code=99) for network device lsw1-f8-eqiad
  • 16:14 ayounsi@cumin1002: START - Cookbook sre.network.tls for network device lsw1-f8-eqiad
  • 16:14 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55387 and previous config saved to /var/cache/conftool/dbconfig/20240123-161426-root.json
  • 16:10 sukhe: enable puppet on A:lvs to merge CR 991785 and run agent on all nodes
  • 15:59 sukhe: disable puppet on A:lvs to merge CR 991785
  • 15:59 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55386 and previous config saved to /var/cache/conftool/dbconfig/20240123-155921-root.json
  • 15:55 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 15:54 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 15:54 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 15:53 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 15:52 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 15:52 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 15:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T352010)', diff saved to https://phabricator.wikimedia.org/P55385 and previous config saved to /var/cache/conftool/dbconfig/20240123-155219-ladsgroup.json
  • 15:44 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55384 and previous config saved to /var/cache/conftool/dbconfig/20240123-154416-root.json
  • 15:41 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2094.codfw.wmnet with OS bullseye
  • 15:39 claime: trafficserver: move 30% of traffic to mw on k8s - T355532
  • 15:37 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
  • 15:37 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
  • 15:37 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
  • 15:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P55383 and previous config saved to /var/cache/conftool/dbconfig/20240123-153712-ladsgroup.json
  • 15:36 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
  • 15:36 claime: Bumping mw-api-ext replicas - T355532
  • 15:36 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
  • 15:36 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
  • 15:35 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
  • 15:35 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
  • 15:35 claime: Bumping mw-web replicas - T355532
  • 15:33 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [codfw] DONE helmfile.d/services/termbox: apply
  • 15:32 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [codfw] START helmfile.d/services/termbox: apply
  • 15:32 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [eqiad] DONE helmfile.d/services/termbox: apply
  • 15:31 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [eqiad] START helmfile.d/services/termbox: apply
  • 15:31 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [staging] DONE helmfile.d/services/termbox: apply
  • 15:31 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [staging] START helmfile.d/services/termbox: apply
  • 15:29 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55382 and previous config saved to /var/cache/conftool/dbconfig/20240123-152911-root.json
  • 15:22 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [codfw] DONE helmfile.d/services/termbox: apply
  • 15:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P55381 and previous config saved to /var/cache/conftool/dbconfig/20240123-152206-ladsgroup.json
  • 15:21 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [codfw] START helmfile.d/services/termbox: apply
  • 15:21 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [eqiad] DONE helmfile.d/services/termbox: apply
  • 15:20 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [eqiad] START helmfile.d/services/termbox: apply
  • 15:20 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [staging] DONE helmfile.d/services/termbox: apply
  • 15:19 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [staging] START helmfile.d/services/termbox: apply
  • 15:14 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55380 and previous config saved to /var/cache/conftool/dbconfig/20240123-151406-root.json
  • 15:13 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2165.codfw.wmnet with reason: Maintenance
  • 15:13 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2165.codfw.wmnet with reason: Maintenance
  • 15:08 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [codfw] DONE helmfile.d/services/termbox: apply
  • 15:08 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [codfw] START helmfile.d/services/termbox: apply
  • 15:08 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [eqiad] DONE helmfile.d/services/termbox: apply
  • 15:08 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [eqiad] START helmfile.d/services/termbox: apply
  • 15:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T352010)', diff saved to https://phabricator.wikimedia.org/P55379 and previous config saved to /var/cache/conftool/dbconfig/20240123-150659-ladsgroup.json
  • 15:06 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [staging] DONE helmfile.d/services/termbox: apply
  • 15:05 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [staging] START helmfile.d/services/termbox: apply
  • 15:00 Lucas_WMDE: UTC afternoon backport+config window done
  • 14:59 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for ORES: Enable renamed revertrisklanguageagnostic model (T348298) (duration: 11m 20s)
  • 14:59 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55378 and previous config saved to /var/cache/conftool/dbconfig/20240123-145901-root.json
  • 14:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195 (T354336)', diff saved to https://phabricator.wikimedia.org/P55377 and previous config saved to /var/cache/conftool/dbconfig/20240123-145353-marostegui.json
  • 14:53 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and kharlan: Continuing with sync
  • 14:49 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and kharlan: Backport for ORES: Enable renamed revertrisklanguageagnostic model (T348298) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:48 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for ORES: Enable renamed revertrisklanguageagnostic model (T348298)
  • 14:47 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1173.eqiad.wmnet with OS bookworm
  • 14:43 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55376 and previous config saved to /var/cache/conftool/dbconfig/20240123-144356-root.json
  • 14:42 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for Restore support for matching 'LIKE' patterns/wildcards (T355478) (duration: 07m 50s)
  • 14:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195', diff saved to https://phabricator.wikimedia.org/P55375 and previous config saved to /var/cache/conftool/dbconfig/20240123-143846-marostegui.json
  • 14:36 logmsgbot: lucaswerkmeister-wmde@deploy2002 matmarex and lucaswerkmeister-wmde: Continuing with sync
  • 14:36 logmsgbot: lucaswerkmeister-wmde@deploy2002 matmarex and lucaswerkmeister-wmde: Backport for Restore support for matching 'LIKE' patterns/wildcards (T355478) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:34 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for Restore support for matching 'LIKE' patterns/wildcards (T355478)
  • 14:33 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for Restore support for matching 'LIKE' patterns/wildcards (T355478) (duration: 10m 29s)
  • 14:32 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts sretest1003.eqiad.wmnet
  • 14:32 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest1003.eqiad.wmnet
  • 14:27 logmsgbot: lucaswerkmeister-wmde@deploy2002 matmarex and lucaswerkmeister-wmde: Continuing with sync
  • 14:26 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1173.eqiad.wmnet with reason: host reimage
  • 14:24 logmsgbot: lucaswerkmeister-wmde@deploy2002 matmarex and lucaswerkmeister-wmde: Backport for Restore support for matching 'LIKE' patterns/wildcards (T355478) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:24 pt1979@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest1003.eqiad.wmnet
  • 14:23 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1173.eqiad.wmnet with reason: host reimage
  • 14:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195', diff saved to https://phabricator.wikimedia.org/P55374 and previous config saved to /var/cache/conftool/dbconfig/20240123-142339-marostegui.json
  • 14:23 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1003.eqiad.wmnet
  • 14:23 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for Restore support for matching 'LIKE' patterns/wildcards (T355478)
  • 14:20 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2094.codfw.wmnet with OS bullseye
  • 14:18 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for ext-EventLogging,ext-EventStreamConfig: Remove mediawiki.special_diff_interactions stream (T353366) (duration: 11m 49s)
  • 14:15 pt1979@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts sretest1003.eqiad.wmnet
  • 14:12 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and phuedx: Continuing with sync
  • 14:12 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1173.eqiad.wmnet with OS bookworm
  • 14:08 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and phuedx: Backport for ext-EventLogging,ext-EventStreamConfig: Remove mediawiki.special_diff_interactions stream (T353366) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195 (T354336)', diff saved to https://phabricator.wikimedia.org/P55373 and previous config saved to /var/cache/conftool/dbconfig/20240123-140833-marostegui.json
  • 14:07 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1003.eqiad.wmnet
  • 14:06 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for ext-EventLogging,ext-EventStreamConfig: Remove mediawiki.special_diff_interactions stream (T353366)
  • 14:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1173 (T343718)', diff saved to https://phabricator.wikimedia.org/P55372 and previous config saved to /var/cache/conftool/dbconfig/20240123-140636-ladsgroup.json
  • 14:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
  • 14:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
  • 14:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1173.eqiad.wmnet with reason: Maintenance
  • 14:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1173.eqiad.wmnet with reason: Maintenance
  • 13:58 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2195 (T354336)', diff saved to https://phabricator.wikimedia.org/P55371 and previous config saved to /var/cache/conftool/dbconfig/20240123-135819-marostegui.json
  • 13:58 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2195.codfw.wmnet with reason: Maintenance
  • 13:58 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2195.codfw.wmnet with reason: Maintenance
  • 13:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T354336)', diff saved to https://phabricator.wikimedia.org/P55370 and previous config saved to /var/cache/conftool/dbconfig/20240123-135757-marostegui.json
  • 13:52 Dreamy_Jazz: Ran `foreachwikiindblist group0 extensions/MediaModeration/maintenance/resendMatchEmails.php 20200405 --verbose`
  • 13:51 klausman@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
  • 13:50 klausman@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
  • 13:50 klausman@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
  • 13:49 klausman@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
  • 13:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host snapshot1016.eqiad.wmnet with OS bullseye
  • 13:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P55369 and previous config saved to /var/cache/conftool/dbconfig/20240123-134250-marostegui.json
  • 13:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P55368 and previous config saved to /var/cache/conftool/dbconfig/20240123-132744-marostegui.json
  • 13:19 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55367 and previous config saved to /var/cache/conftool/dbconfig/20240123-131909-root.json
  • 13:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on snapshot1016.eqiad.wmnet with reason: host reimage
  • 13:12 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on snapshot1016.eqiad.wmnet with reason: host reimage
  • 13:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T354336)', diff saved to https://phabricator.wikimedia.org/P55366 and previous config saved to /var/cache/conftool/dbconfig/20240123-131237-marostegui.json
  • 13:10 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2181 (T354336)', diff saved to https://phabricator.wikimedia.org/P55365 and previous config saved to /var/cache/conftool/dbconfig/20240123-131027-marostegui.json
  • 13:10 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2181.codfw.wmnet with reason: Maintenance
  • 13:10 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2181.codfw.wmnet with reason: Maintenance
  • 13:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168:3318 (T354336)', diff saved to https://phabricator.wikimedia.org/P55364 and previous config saved to /var/cache/conftool/dbconfig/20240123-131005-marostegui.json
  • 13:04 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55363 and previous config saved to /var/cache/conftool/dbconfig/20240123-130404-root.json
  • 12:56 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host snapshot1016.eqiad.wmnet with OS bullseye
  • 12:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168:3318', diff saved to https://phabricator.wikimedia.org/P55362 and previous config saved to /var/cache/conftool/dbconfig/20240123-125459-marostegui.json
  • 12:49 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55361 and previous config saved to /var/cache/conftool/dbconfig/20240123-124859-root.json
  • 12:45 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host snapshot1017.eqiad.wmnet
  • 12:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168:3318', diff saved to https://phabricator.wikimedia.org/P55360 and previous config saved to /var/cache/conftool/dbconfig/20240123-123952-marostegui.json
  • 12:33 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55359 and previous config saved to /var/cache/conftool/dbconfig/20240123-123354-root.json
  • 12:33 marostegui@cumin1002: dbctl commit (dc=all): 'db2171:3316 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55358 and previous config saved to /var/cache/conftool/dbconfig/20240123-123346-root.json
  • 12:31 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host snapshot1017.eqiad.wmnet
  • 12:28 claime: Restarting killed maintenance job mediawiki_job_MachineVision_prioritize_uncategorized.service
  • 12:26 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for sretest1001.eqiad.wmnet
  • 12:26 kamila@cumin1002: START - Cookbook sre.hosts.remove-downtime for sretest1001.eqiad.wmnet
  • 12:26 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on sretest1001.eqiad.wmnet with reason: testing the cookbook
  • 12:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168:3318 (T354336)', diff saved to https://phabricator.wikimedia.org/P55357 and previous config saved to /var/cache/conftool/dbconfig/20240123-122446-marostegui.json
  • 12:23 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on sretest1001.eqiad.wmnet with reason: testing the cookbook
  • 12:23 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2168:3318 (T354336)', diff saved to https://phabricator.wikimedia.org/P55356 and previous config saved to /var/cache/conftool/dbconfig/20240123-122336-marostegui.json
  • 12:23 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2168.codfw.wmnet with reason: Maintenance
  • 12:23 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2168.codfw.wmnet with reason: Maintenance
  • 12:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318 (T354336)', diff saved to https://phabricator.wikimedia.org/P55355 and previous config saved to /var/cache/conftool/dbconfig/20240123-122314-marostegui.json
  • 12:21 marostegui@cumin1002: dbctl commit (dc=all): 'db2171:3315 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55354 and previous config saved to /var/cache/conftool/dbconfig/20240123-122105-root.json
  • 12:18 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55353 and previous config saved to /var/cache/conftool/dbconfig/20240123-121849-root.json
  • 12:18 marostegui@cumin1002: dbctl commit (dc=all): 'db2171:3316 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55352 and previous config saved to /var/cache/conftool/dbconfig/20240123-121841-root.json
  • 12:17 claime: Restarting ferm.service on k8s node mw1495.eqiad.wmnet - T354855
  • 12:16 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host snapshot1016.eqiad.wmnet
  • 12:14 claime: scap::dsh::scap_proxies: Replace mw1486 by mw1405 - T355622
  • 12:13 Amir1: dropping bv2015_edits table from all wikis (T355594)
  • 12:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318', diff saved to https://phabricator.wikimedia.org/P55351 and previous config saved to /var/cache/conftool/dbconfig/20240123-120807-marostegui.json
  • 12:06 marostegui@cumin1002: dbctl commit (dc=all): 'db2171:3315 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55350 and previous config saved to /var/cache/conftool/dbconfig/20240123-120600-root.json
  • 12:05 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host snapshot1016.eqiad.wmnet
  • 12:03 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55349 and previous config saved to /var/cache/conftool/dbconfig/20240123-120344-root.json
  • 12:03 marostegui@cumin1002: dbctl commit (dc=all): 'db2171:3316 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55348 and previous config saved to /var/cache/conftool/dbconfig/20240123-120335-root.json
  • 12:03 Amir1: dropping bv2009_edits table from all wikis (T355594)
  • 12:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host snapshot1017.eqiad.wmnet with OS bullseye
  • 11:54 godog: initial cleanup of replicated thanos blocks - T351927
  • 11:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318', diff saved to https://phabricator.wikimedia.org/P55347 and previous config saved to /var/cache/conftool/dbconfig/20240123-115301-marostegui.json
  • 11:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2171:3315 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55346 and previous config saved to /var/cache/conftool/dbconfig/20240123-115055-root.json
  • 11:48 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55345 and previous config saved to /var/cache/conftool/dbconfig/20240123-114840-root.json
  • 11:48 marostegui@cumin1002: dbctl commit (dc=all): 'db2171:3316 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55344 and previous config saved to /var/cache/conftool/dbconfig/20240123-114831-root.json
  • 11:48 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1173', diff saved to https://phabricator.wikimedia.org/P55343 and previous config saved to /var/cache/conftool/dbconfig/20240123-114826-marostegui.json
  • 11:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318 (T354336)', diff saved to https://phabricator.wikimedia.org/P55342 and previous config saved to /var/cache/conftool/dbconfig/20240123-113754-marostegui.json
  • 11:35 marostegui@cumin1002: dbctl commit (dc=all): 'db2171:3315 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55341 and previous config saved to /var/cache/conftool/dbconfig/20240123-113550-root.json
  • 11:35 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2167:3318 (T354336)', diff saved to https://phabricator.wikimedia.org/P55340 and previous config saved to /var/cache/conftool/dbconfig/20240123-113544-marostegui.json
  • 11:35 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2167.codfw.wmnet with reason: Maintenance
  • 11:35 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2167.codfw.wmnet with reason: Maintenance
  • 11:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T354336)', diff saved to https://phabricator.wikimedia.org/P55339 and previous config saved to /var/cache/conftool/dbconfig/20240123-113522-marostegui.json
  • 11:35 marostegui: Starting s6 eqiad failover from db1173 to db1231 - T355660
  • 11:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on snapshot1017.eqiad.wmnet with reason: host reimage
  • 11:31 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on snapshot1017.eqiad.wmnet with reason: host reimage
  • 11:24 marostegui@cumin1002: dbctl commit (dc=all): 'db2171:3316 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55338 and previous config saved to /var/cache/conftool/dbconfig/20240123-112420-root.json
  • 11:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P55336 and previous config saved to /var/cache/conftool/dbconfig/20240123-112016-marostegui.json
  • 11:11 Amir1: dropping pif_edits table from all wikis (T355594)
  • 11:11 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host snapshot1017.eqiad.wmnet with OS bullseye
  • 11:09 marostegui@cumin1002: dbctl commit (dc=all): 'db2171:3316 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55335 and previous config saved to /var/cache/conftool/dbconfig/20240123-110915-root.json
  • 11:07 marostegui@cumin1002: dbctl commit (dc=all): 'Set db1231 with weight 0 T355660', diff saved to https://phabricator.wikimedia.org/P55333 and previous config saved to /var/cache/conftool/dbconfig/20240123-110743-marostegui.json
  • 11:07 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 28 hosts with reason: Primary switchover s6 T355660
  • 11:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 28 hosts with reason: Primary switchover s6 T355660
  • 11:05 marostegui@cumin1002: dbctl commit (dc=all): 'db2171:3315 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55332 and previous config saved to /var/cache/conftool/dbconfig/20240123-110540-root.json
  • 11:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P55331 and previous config saved to /var/cache/conftool/dbconfig/20240123-110509-marostegui.json
  • 10:58 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-master1002.eqiad.wmnet
  • 10:58 btullis@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:58 btullis@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-master1002.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1002"
  • 10:56 btullis@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-master1002.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1002"
  • 10:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2171.codfw.wmnet with OS bookworm
  • 10:54 marostegui@cumin1002: dbctl commit (dc=all): 'db2171:3316 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55330 and previous config saved to /var/cache/conftool/dbconfig/20240123-105410-root.json
  • 10:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2171:3315 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55329 and previous config saved to /var/cache/conftool/dbconfig/20240123-105035-root.json
  • 10:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T354336)', diff saved to https://phabricator.wikimedia.org/P55328 and previous config saved to /var/cache/conftool/dbconfig/20240123-105003-marostegui.json
  • 10:48 btullis@cumin1002: START - Cookbook sre.dns.netbox
  • 10:47 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2166 (T354336)', diff saved to https://phabricator.wikimedia.org/P55327 and previous config saved to /var/cache/conftool/dbconfig/20240123-104753-marostegui.json
  • 10:47 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2166.codfw.wmnet with reason: Maintenance
  • 10:47 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2166.codfw.wmnet with reason: Maintenance
  • 10:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T354336)', diff saved to https://phabricator.wikimedia.org/P55326 and previous config saved to /var/cache/conftool/dbconfig/20240123-104731-marostegui.json
  • 10:43 btullis@cumin1002: START - Cookbook sre.hosts.decommission for hosts an-master1002.eqiad.wmnet
  • 10:35 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2171.codfw.wmnet with reason: host reimage
  • 10:34 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-master1001.eqiad.wmnet
  • 10:34 btullis@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:34 btullis@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-master1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1002"
  • 10:32 btullis@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-master1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1002"
  • 10:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P55325 and previous config saved to /var/cache/conftool/dbconfig/20240123-103225-marostegui.json
  • 10:31 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2171.codfw.wmnet with reason: host reimage
  • 10:27 btullis@cumin1002: START - Cookbook sre.dns.netbox
  • 10:23 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host snapshot1017.eqiad.wmnet with OS bullseye
  • 10:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P55324 and previous config saved to /var/cache/conftool/dbconfig/20240123-101718-marostegui.json
  • 10:13 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts sretest1003.eqiad.wmnet
  • 10:13 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest1003.eqiad.wmnet
  • 10:12 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2171.codfw.wmnet with OS bookworm
  • 10:10 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2171:3315 db2171:3316', diff saved to https://phabricator.wikimedia.org/P55323 and previous config saved to /var/cache/conftool/dbconfig/20240123-101056-marostegui.json
  • 10:10 btullis@cumin1002: START - Cookbook sre.hosts.decommission for hosts an-master1001.eqiad.wmnet
  • 10:04 ayounsi@cumin1002: START - Cookbook sre.hosts.reboot-single for host sretest1003.eqiad.wmnet
  • 10:04 ayounsi@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1003.eqiad.wmnet
  • 10:03 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts sretest1003.eqiad.wmnet
  • 10:03 ayounsi@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1003.eqiad.wmnet
  • 10:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host snapshot1016.eqiad.wmnet with OS bullseye
  • 10:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T354336)', diff saved to https://phabricator.wikimedia.org/P55322 and previous config saved to /var/cache/conftool/dbconfig/20240123-100212-marostegui.json
  • 10:00 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2164 (T354336)', diff saved to https://phabricator.wikimedia.org/P55321 and previous config saved to /var/cache/conftool/dbconfig/20240123-100002-marostegui.json
  • 09:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 09:59 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts sretest1003.eqiad.wmnet
  • 09:59 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 09:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2164.codfw.wmnet with reason: Maintenance
  • 09:59 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2164.codfw.wmnet with reason: Maintenance
  • 09:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T354336)', diff saved to https://phabricator.wikimedia.org/P55320 and previous config saved to /var/cache/conftool/dbconfig/20240123-095923-marostegui.json
  • 09:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P55319 and previous config saved to /var/cache/conftool/dbconfig/20240123-094417-marostegui.json
  • 09:41 ayounsi@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1003.eqiad.wmnet
  • 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on snapshot1016.eqiad.wmnet with reason: host reimage
  • 09:29 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on snapshot1016.eqiad.wmnet with reason: host reimage
  • 09:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P55318 and previous config saved to /var/cache/conftool/dbconfig/20240123-092910-marostegui.json
  • 09:24 hashar@deploy2002: rebuilt and synchronized wikiversions files: group0 wikis to 1.42.0-wmf.15 refs T354433
  • 09:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T354336)', diff saved to https://phabricator.wikimedia.org/P55317 and previous config saved to /var/cache/conftool/dbconfig/20240123-091404-marostegui.json
  • 09:11 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2163 (T354336)', diff saved to https://phabricator.wikimedia.org/P55316 and previous config saved to /var/cache/conftool/dbconfig/20240123-091154-marostegui.json
  • 09:11 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2163.codfw.wmnet with reason: Maintenance
  • 09:11 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2163.codfw.wmnet with reason: Maintenance
  • 09:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2162 (T354336)', diff saved to https://phabricator.wikimedia.org/P55315 and previous config saved to /var/cache/conftool/dbconfig/20240123-091132-marostegui.json
  • 09:04 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts sretest1003.eqiad.wmnet
  • 09:01 ayounsi@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1003.eqiad.wmnet
  • 09:01 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55314 and previous config saved to /var/cache/conftool/dbconfig/20240123-090104-root.json
  • 08:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2162', diff saved to https://phabricator.wikimedia.org/P55313 and previous config saved to /var/cache/conftool/dbconfig/20240123-085625-marostegui.json
  • 08:55 taavi: updating CR firewall policy with https://gerrit.wikimedia.org/r/c/operations/homer/public/+/992245/ https://gerrit.wikimedia.org/r/c/operations/homer/public/+/992359/
  • 08:51 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host snapshot1016.eqiad.wmnet with OS bullseye
  • 08:46 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55312 and previous config saved to /var/cache/conftool/dbconfig/20240123-084559-root.json
  • 08:44 gmodena@deploy2002: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
  • 08:44 gmodena@deploy2002: helmfile [staging] START helmfile.d/services/eventstreams: apply
  • 08:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2155 (T352010)', diff saved to https://phabricator.wikimedia.org/P55311 and previous config saved to /var/cache/conftool/dbconfig/20240123-084301-ladsgroup.json
  • 08:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 08:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 08:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
  • 08:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
  • 08:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119 (T352010)', diff saved to https://phabricator.wikimedia.org/P55310 and previous config saved to /var/cache/conftool/dbconfig/20240123-084244-ladsgroup.json
  • 08:41 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts sretest1002.eqiad.wmnet
  • 08:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2162', diff saved to https://phabricator.wikimedia.org/P55309 and previous config saved to /var/cache/conftool/dbconfig/20240123-084119-marostegui.json
  • 08:39 ayounsi@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1002.eqiad.wmnet
  • 08:37 gmodena@deploy2002: helmfile [staging] START helmfile.d/services/eventstreams: apply
  • 08:30 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55308 and previous config saved to /var/cache/conftool/dbconfig/20240123-083054-root.json
  • 08:28 taavi: updating CR firewall policy with https://gerrit.wikimedia.org/r/c/operations/homer/public/+/992244
  • 08:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119', diff saved to https://phabricator.wikimedia.org/P55307 and previous config saved to /var/cache/conftool/dbconfig/20240123-082738-ladsgroup.json
  • 08:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2162 (T354336)', diff saved to https://phabricator.wikimedia.org/P55306 and previous config saved to /var/cache/conftool/dbconfig/20240123-082613-marostegui.json
  • 08:24 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2162 (T354336)', diff saved to https://phabricator.wikimedia.org/P55305 and previous config saved to /var/cache/conftool/dbconfig/20240123-082402-marostegui.json
  • 08:23 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2162.codfw.wmnet with reason: Maintenance
  • 08:23 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2162.codfw.wmnet with reason: Maintenance
  • 08:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2161 (T354336)', diff saved to https://phabricator.wikimedia.org/P55304 and previous config saved to /var/cache/conftool/dbconfig/20240123-082340-marostegui.json
  • 08:15 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55303 and previous config saved to /var/cache/conftool/dbconfig/20240123-081549-root.json
  • 08:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119', diff saved to https://phabricator.wikimedia.org/P55302 and previous config saved to /var/cache/conftool/dbconfig/20240123-081231-ladsgroup.json
  • 08:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P55301 and previous config saved to /var/cache/conftool/dbconfig/20240123-080834-marostegui.json
  • 08:02 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2051.codfw.wmnet
  • 08:00 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55300 and previous config saved to /var/cache/conftool/dbconfig/20240123-080044-root.json
  • 07:57 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2051.codfw.wmnet
  • 07:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119 (T352010)', diff saved to https://phabricator.wikimedia.org/P55299 and previous config saved to /var/cache/conftool/dbconfig/20240123-075725-ladsgroup.json
  • 07:57 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1051.eqiad.wmnet
  • 07:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P55298 and previous config saved to /var/cache/conftool/dbconfig/20240123-075327-marostegui.json
  • 07:52 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1051.eqiad.wmnet
  • 07:45 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55297 and previous config saved to /var/cache/conftool/dbconfig/20240123-074538-root.json
  • 07:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2161 (T354336)', diff saved to https://phabricator.wikimedia.org/P55296 and previous config saved to /var/cache/conftool/dbconfig/20240123-073821-marostegui.json
  • 07:36 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2161 (T354336)', diff saved to https://phabricator.wikimedia.org/P55295 and previous config saved to /var/cache/conftool/dbconfig/20240123-073610-marostegui.json
  • 07:36 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2161.codfw.wmnet with reason: Maintenance
  • 07:35 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2161.codfw.wmnet with reason: Maintenance
  • 07:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T354336)', diff saved to https://phabricator.wikimedia.org/P55294 and previous config saved to /var/cache/conftool/dbconfig/20240123-073548-marostegui.json
  • 07:34 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1231.eqiad.wmnet with OS bookworm
  • 07:30 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55293 and previous config saved to /var/cache/conftool/dbconfig/20240123-073033-root.json
  • 07:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T352010)', diff saved to https://phabricator.wikimedia.org/P55292 and previous config saved to /var/cache/conftool/dbconfig/20240123-073021-ladsgroup.json
  • 07:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P55291 and previous config saved to /var/cache/conftool/dbconfig/20240123-072041-marostegui.json
  • 07:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P55290 and previous config saved to /var/cache/conftool/dbconfig/20240123-071515-ladsgroup.json
  • 07:13 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1231.eqiad.wmnet with reason: host reimage
  • 07:10 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1231.eqiad.wmnet with reason: host reimage
  • 07:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P55289 and previous config saved to /var/cache/conftool/dbconfig/20240123-070535-marostegui.json
  • 07:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P55288 and previous config saved to /var/cache/conftool/dbconfig/20240123-070008-ladsgroup.json
  • 06:57 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1231.eqiad.wmnet with OS bookworm
  • 06:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1231', diff saved to https://phabricator.wikimedia.org/P55287 and previous config saved to /var/cache/conftool/dbconfig/20240123-065606-marostegui.json
  • 06:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T354336)', diff saved to https://phabricator.wikimedia.org/P55285 and previous config saved to /var/cache/conftool/dbconfig/20240123-065029-marostegui.json
  • 06:48 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2154 (T354336)', diff saved to https://phabricator.wikimedia.org/P55284 and previous config saved to /var/cache/conftool/dbconfig/20240123-064819-marostegui.json
  • 06:48 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2154.codfw.wmnet with reason: Maintenance
  • 06:48 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2154.codfw.wmnet with reason: Maintenance
  • 06:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2152 (T354336)', diff saved to https://phabricator.wikimedia.org/P55283 and previous config saved to /var/cache/conftool/dbconfig/20240123-064757-marostegui.json
  • 06:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T352010)', diff saved to https://phabricator.wikimedia.org/P55282 and previous config saved to /var/cache/conftool/dbconfig/20240123-064502-ladsgroup.json
  • 06:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P55281 and previous config saved to /var/cache/conftool/dbconfig/20240123-063250-marostegui.json
  • 06:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P55280 and previous config saved to /var/cache/conftool/dbconfig/20240123-061744-marostegui.json
  • 06:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2152 (T354336)', diff saved to https://phabricator.wikimedia.org/P55279 and previous config saved to /var/cache/conftool/dbconfig/20240123-060237-marostegui.json
  • 06:01 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2152 (T354336)', diff saved to https://phabricator.wikimedia.org/P55278 and previous config saved to /var/cache/conftool/dbconfig/20240123-060127-marostegui.json
  • 06:01 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2152.codfw.wmnet with reason: Maintenance
  • 06:01 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2152.codfw.wmnet with reason: Maintenance
  • 06:01 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2100.codfw.wmnet with reason: Maintenance
  • 06:00 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2100.codfw.wmnet with reason: Maintenance
  • 06:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 06:00 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 05:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1209.eqiad.wmnet with reason: Maintenance
  • 05:59 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1209.eqiad.wmnet with reason: Maintenance
  • 04:54 mwpresync@deploy2002: Finished scap: testwikis wikis to 1.42.0-wmf.15 refs T354433 (duration: 51m 22s)
  • 04:02 mwpresync@deploy2002: Started scap: testwikis wikis to 1.42.0-wmf.15 refs T354433
  • 01:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2119 (T352010)', diff saved to https://phabricator.wikimedia.org/P55277 and previous config saved to /var/cache/conftool/dbconfig/20240123-011434-ladsgroup.json
  • 01:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2119.codfw.wmnet with reason: Maintenance
  • 01:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2119.codfw.wmnet with reason: Maintenance
  • 00:58 zabe: zabe@mwmaint2002:~$ mwscript namespaceDupes.php --wiki=ruwikinews --fix # T350889
  • 00:57 zabe: zabe@mwmaint2002:~$ mwscript namespaceDupes.php --wiki=fiwikinews --fix # T350889
  • 00:57 zabe: zabe@mwmaint2002:~$ mwscript namespaceDupes.php --wiki=fiwiki --fix # T350889
  • 00:56 zabe: zabe@mwmaint2002:~$ mwscript namespaceDupes.php --wiki=enwiki --fix # T350889
  • 00:55 zabe: zabe@mwmaint2002:~$ mwscript namespaceDupes.php --wiki=cywiki --fix # T350889
  • 00:42 zabe: running 'zabe@mwmaint2002:~$ mwscript namespaceDupes.php --wiki=viwiki --fix' in screen
  • 00:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2179 (T352010)', diff saved to https://phabricator.wikimedia.org/P55276 and previous config saved to /var/cache/conftool/dbconfig/20240123-003338-ladsgroup.json
  • 00:33 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2179.codfw.wmnet with reason: Maintenance
  • 00:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2179.codfw.wmnet with reason: Maintenance
  • 00:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T352010)', diff saved to https://phabricator.wikimedia.org/P55275 and previous config saved to /var/cache/conftool/dbconfig/20240123-003316-ladsgroup.json
  • 00:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P55274 and previous config saved to /var/cache/conftool/dbconfig/20240123-001810-ladsgroup.json
  • 00:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P55273 and previous config saved to /var/cache/conftool/dbconfig/20240123-000303-ladsgroup.json

2024-01-22

  • 23:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T352010)', diff saved to https://phabricator.wikimedia.org/P55272 and previous config saved to /var/cache/conftool/dbconfig/20240122-234757-ladsgroup.json
  • 23:14 zabe@deploy2002: Finished scap: Backport for Stop setting wgShowIPinHeader (T355479), beta: Start reading from af_user(_text)/afh_user(_text) (T355616) (duration: 07m 31s)
  • 23:08 zabe@deploy2002: zabe: Continuing with sync
  • 23:08 zabe@deploy2002: zabe: Backport for Stop setting wgShowIPinHeader (T355479), beta: Start reading from af_user(_text)/afh_user(_text) (T355616) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 23:06 zabe@deploy2002: Started scap: Backport for Stop setting wgShowIPinHeader (T355479), beta: Start reading from af_user(_text)/afh_user(_text) (T355616)
  • 22:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
  • 22:56 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
  • 22:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1226 (T354336)', diff saved to https://phabricator.wikimedia.org/P55271 and previous config saved to /var/cache/conftool/dbconfig/20240122-225618-marostegui.json
  • 22:47 bking@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['elastic2088']
  • 22:47 bking@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['elastic2088']
  • 22:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P55270 and previous config saved to /var/cache/conftool/dbconfig/20240122-224111-marostegui.json
  • 22:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P55269 and previous config saved to /var/cache/conftool/dbconfig/20240122-222605-marostegui.json
  • 22:24 maryum: Deployed patch for T355538
  • 22:14 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudrabbit1003.eqiad.wmnet with OS bookworm
  • 22:14 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - taavi@cumin1002"
  • 22:13 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - taavi@cumin1002"
  • 22:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1226 (T354336)', diff saved to https://phabricator.wikimedia.org/P55268 and previous config saved to /var/cache/conftool/dbconfig/20240122-221058-marostegui.json
  • 22:10 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1226 (T354336)', diff saved to https://phabricator.wikimedia.org/P55267 and previous config saved to /var/cache/conftool/dbconfig/20240122-220850-marostegui.json
  • 22:09 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1226.eqiad.wmnet with reason: Maintenance
  • 22:09 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1226.eqiad.wmnet with reason: Maintenance
  • 22:09 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1216.eqiad.wmnet with reason: Maintenance
  • 22:08 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1216.eqiad.wmnet with reason: Maintenance
  • 22:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214 (T354336)', diff saved to https://phabricator.wikimedia.org/P55266 and previous config saved to /var/cache/conftool/dbconfig/20240122-220811-marostegui.json
  • 21:56 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudrabbit1003.eqiad.wmnet with reason: host reimage
  • 21:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P55265 and previous config saved to /var/cache/conftool/dbconfig/20240122-215305-marostegui.json
  • 21:53 taavi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudrabbit1003.eqiad.wmnet with reason: host reimage
  • 21:51 taavi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 21:51 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add cloudrabbit1003 cloud-private address - taavi@cumin1002"
  • 21:50 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add cloudrabbit1003 cloud-private address - taavi@cumin1002"
  • 21:48 taavi@cumin1002: START - Cookbook sre.dns.netbox
  • 21:46 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "set cloudrabbit1003 as active - taavi@cumin1002"
  • 21:45 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "set cloudrabbit1003 as active - taavi@cumin1002"
  • 21:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P55264 and previous config saved to /var/cache/conftool/dbconfig/20240122-213758-marostegui.json
  • 21:33 taavi@cumin1002: START - Cookbook sre.hosts.reimage for host cloudrabbit1003.eqiad.wmnet with OS bookworm
  • 21:32 taavi@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudrabbit1003.eqiad.wmnet with OS bookworm
  • 21:24 taavi@cumin1002: START - Cookbook sre.hosts.reimage for host cloudrabbit1003.eqiad.wmnet with OS bookworm
  • 21:24 taavi@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudrabbit1003.eqiad.wmnet with OS bookworm
  • 21:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214 (T354336)', diff saved to https://phabricator.wikimedia.org/P55263 and previous config saved to /var/cache/conftool/dbconfig/20240122-212252-marostegui.json
  • 21:21 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1214 (T354336)', diff saved to https://phabricator.wikimedia.org/P55262 and previous config saved to /var/cache/conftool/dbconfig/20240122-212144-marostegui.json
  • 21:21 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1214.eqiad.wmnet with reason: Maintenance
  • 21:21 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1214.eqiad.wmnet with reason: Maintenance
  • 21:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1211 (T354336)', diff saved to https://phabricator.wikimedia.org/P55261 and previous config saved to /var/cache/conftool/dbconfig/20240122-212122-marostegui.json
  • 21:17 taavi@cumin1002: START - Cookbook sre.hosts.reimage for host cloudrabbit1003.eqiad.wmnet with OS bookworm
  • 21:07 taavi@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudrabbit1003
  • 21:07 taavi@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host cloudrabbit1003
  • 21:07 taavi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 21:07 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: allocate IPs for cloudrabbit1003 - taavi@cumin1002"
  • 21:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P55260 and previous config saved to /var/cache/conftool/dbconfig/20240122-210615-marostegui.json
  • 21:05 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: allocate IPs for cloudrabbit1003 - taavi@cumin1002"
  • 21:03 taavi@cumin1002: START - Cookbook sre.dns.netbox
  • 20:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P55259 and previous config saved to /var/cache/conftool/dbconfig/20240122-205109-marostegui.json
  • 20:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1211 (T354336)', diff saved to https://phabricator.wikimedia.org/P55258 and previous config saved to /var/cache/conftool/dbconfig/20240122-203602-marostegui.json
  • 20:33 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1211 (T354336)', diff saved to https://phabricator.wikimedia.org/P55257 and previous config saved to /var/cache/conftool/dbconfig/20240122-203354-marostegui.json
  • 20:33 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1211.eqiad.wmnet with reason: Maintenance
  • 20:33 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1211.eqiad.wmnet with reason: Maintenance
  • 20:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1203 (T354336)', diff saved to https://phabricator.wikimedia.org/P55256 and previous config saved to /var/cache/conftool/dbconfig/20240122-203332-marostegui.json
  • 20:18 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P55255 and previous config saved to /var/cache/conftool/dbconfig/20240122-201826-marostegui.json
  • 20:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P55254 and previous config saved to /var/cache/conftool/dbconfig/20240122-200319-marostegui.json
  • 19:57 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 19:56 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 19:56 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 19:55 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 19:54 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 19:54 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 19:51 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 19:50 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 19:50 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 19:48 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 19:48 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 19:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1203 (T354336)', diff saved to https://phabricator.wikimedia.org/P55253 and previous config saved to /var/cache/conftool/dbconfig/20240122-194813-marostegui.json
  • 19:47 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 19:47 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1203 (T354336)', diff saved to https://phabricator.wikimedia.org/P55252 and previous config saved to /var/cache/conftool/dbconfig/20240122-194704-marostegui.json
  • 19:46 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1203.eqiad.wmnet with reason: Maintenance
  • 19:46 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1203.eqiad.wmnet with reason: Maintenance
  • 19:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1193 (T354336)', diff saved to https://phabricator.wikimedia.org/P55251 and previous config saved to /var/cache/conftool/dbconfig/20240122-194642-marostegui.json
  • 19:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P55250 and previous config saved to /var/cache/conftool/dbconfig/20240122-193136-marostegui.json
  • 19:28 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 19:28 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 19:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P55249 and previous config saved to /var/cache/conftool/dbconfig/20240122-191629-marostegui.json
  • 19:06 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2088.codfw.wmnet with OS bullseye
  • 19:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1193 (T354336)', diff saved to https://phabricator.wikimedia.org/P55248 and previous config saved to /var/cache/conftool/dbconfig/20240122-190123-marostegui.json
  • 19:00 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1193 (T354336)', diff saved to https://phabricator.wikimedia.org/P55247 and previous config saved to /var/cache/conftool/dbconfig/20240122-190014-marostegui.json
  • 19:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1193.eqiad.wmnet with reason: Maintenance
  • 19:00 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1193.eqiad.wmnet with reason: Maintenance
  • 18:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192 (T354336)', diff saved to https://phabricator.wikimedia.org/P55246 and previous config saved to /var/cache/conftool/dbconfig/20240122-185952-marostegui.json
  • 18:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P55245 and previous config saved to /var/cache/conftool/dbconfig/20240122-184446-marostegui.json
  • 18:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P55244 and previous config saved to /var/cache/conftool/dbconfig/20240122-182939-marostegui.json
  • 18:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2172 (T352010)', diff saved to https://phabricator.wikimedia.org/P55243 and previous config saved to /var/cache/conftool/dbconfig/20240122-182432-ladsgroup.json
  • 18:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2172.codfw.wmnet with reason: Maintenance
  • 18:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2172.codfw.wmnet with reason: Maintenance
  • 18:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T352010)', diff saved to https://phabricator.wikimedia.org/P55242 and previous config saved to /var/cache/conftool/dbconfig/20240122-182359-ladsgroup.json
  • 18:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192 (T354336)', diff saved to https://phabricator.wikimedia.org/P55241 and previous config saved to /var/cache/conftool/dbconfig/20240122-181433-marostegui.json
  • 18:13 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1192 (T354336)', diff saved to https://phabricator.wikimedia.org/P55240 and previous config saved to /var/cache/conftool/dbconfig/20240122-181324-marostegui.json
  • 18:13 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1192.eqiad.wmnet with reason: Maintenance
  • 18:13 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1192.eqiad.wmnet with reason: Maintenance
  • 18:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T354336)', diff saved to https://phabricator.wikimedia.org/P55239 and previous config saved to /var/cache/conftool/dbconfig/20240122-181302-marostegui.json
  • 18:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P55238 and previous config saved to /var/cache/conftool/dbconfig/20240122-180853-ladsgroup.json
  • 17:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P55237 and previous config saved to /var/cache/conftool/dbconfig/20240122-175755-marostegui.json
  • 17:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P55236 and previous config saved to /var/cache/conftool/dbconfig/20240122-175346-ladsgroup.json
  • 17:46 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2088.codfw.wmnet with OS bullseye
  • 17:44 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2088.codfw.wmnet with OS bullseye
  • 17:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P55235 and previous config saved to /var/cache/conftool/dbconfig/20240122-174249-marostegui.json
  • 17:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T352010)', diff saved to https://phabricator.wikimedia.org/P55234 and previous config saved to /var/cache/conftool/dbconfig/20240122-173840-ladsgroup.json
  • 17:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T354336)', diff saved to https://phabricator.wikimedia.org/P55233 and previous config saved to /var/cache/conftool/dbconfig/20240122-172743-marostegui.json
  • 17:26 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1178 (T354336)', diff saved to https://phabricator.wikimedia.org/P55232 and previous config saved to /var/cache/conftool/dbconfig/20240122-172635-marostegui.json
  • 17:26 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1178.eqiad.wmnet with reason: Maintenance
  • 17:26 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1178.eqiad.wmnet with reason: Maintenance
  • 17:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T354336)', diff saved to https://phabricator.wikimedia.org/P55231 and previous config saved to /var/cache/conftool/dbconfig/20240122-172612-marostegui.json
  • 17:17 akosiaris: draining kubestage2001, uncordoning kubestage2002 to allow it to receive the pods. T355437
  • 17:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P55230 and previous config saved to /var/cache/conftool/dbconfig/20240122-171106-marostegui.json
  • 17:05 vgutierrez: restore HAProxy tune.bufsize = 16684 in cp3066 - T354424
  • 16:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P55229 and previous config saved to /var/cache/conftool/dbconfig/20240122-165559-marostegui.json
  • 16:53 vgutierrez: testing HAProxy tune.bufsize = 32768 in cp3066 - T354424
  • 16:46 dcausse@deploy2002: Finished deploy [airflow-dags/search@dcf08b2]: (no justification provided) (duration: 00m 31s)
  • 16:46 dcausse@deploy2002: Started deploy [airflow-dags/search@dcf08b2]: (no justification provided)
  • 16:42 Daimona: T353459 Running mwscript /home/daimona/GenerateInvitationList.php to test the script before it reaches production
  • 16:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T354336)', diff saved to https://phabricator.wikimedia.org/P55228 and previous config saved to /var/cache/conftool/dbconfig/20240122-164053-marostegui.json
  • 16:39 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1495.eqiad.wmnet with OS bullseye
  • 16:38 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1177 (T354336)', diff saved to https://phabricator.wikimedia.org/P55227 and previous config saved to /var/cache/conftool/dbconfig/20240122-163844-marostegui.json
  • 16:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1177.eqiad.wmnet with reason: Maintenance
  • 16:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1177.eqiad.wmnet with reason: Maintenance
  • 16:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T354336)', diff saved to https://phabricator.wikimedia.org/P55226 and previous config saved to /var/cache/conftool/dbconfig/20240122-163822-marostegui.json
  • 16:38 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
  • 16:38 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
  • 16:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2155 (T352010)', diff saved to https://phabricator.wikimedia.org/P55225 and previous config saved to /var/cache/conftool/dbconfig/20240122-163808-ladsgroup.json
  • 16:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 16:38 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
  • 16:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 16:37 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
  • 16:37 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
  • 16:37 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
  • 16:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T352010)', diff saved to https://phabricator.wikimedia.org/P55224 and previous config saved to /var/cache/conftool/dbconfig/20240122-163729-ladsgroup.json
  • 16:31 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1486.eqiad.wmnet with OS bullseye
  • 16:29 klausman@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
  • 16:29 klausman@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
  • 16:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P55222 and previous config saved to /var/cache/conftool/dbconfig/20240122-162315-marostegui.json
  • 16:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P55221 and previous config saved to /var/cache/conftool/dbconfig/20240122-162223-ladsgroup.json
  • 16:14 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1495.eqiad.wmnet with reason: host reimage
  • 16:12 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1486.eqiad.wmnet with reason: host reimage
  • 16:09 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1495.eqiad.wmnet with reason: host reimage
  • 16:08 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1486.eqiad.wmnet with reason: host reimage
  • 16:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P55220 and previous config saved to /var/cache/conftool/dbconfig/20240122-160809-marostegui.json
  • 16:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P55219 and previous config saved to /var/cache/conftool/dbconfig/20240122-160716-ladsgroup.json
  • 15:56 marostegui@cumin1002: dbctl commit (dc=all): 'db1165 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55218 and previous config saved to /var/cache/conftool/dbconfig/20240122-155607-root.json
  • 15:55 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1495.eqiad.wmnet with OS bullseye
  • 15:55 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1486.eqiad.wmnet with OS bullseye
  • 15:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T354336)', diff saved to https://phabricator.wikimedia.org/P55217 and previous config saved to /var/cache/conftool/dbconfig/20240122-155302-marostegui.json
  • 15:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T352010)', diff saved to https://phabricator.wikimedia.org/P55216 and previous config saved to /var/cache/conftool/dbconfig/20240122-155210-ladsgroup.json
  • 15:51 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1172 (T354336)', diff saved to https://phabricator.wikimedia.org/P55215 and previous config saved to /var/cache/conftool/dbconfig/20240122-155154-marostegui.json
  • 15:51 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1172.eqiad.wmnet with reason: Maintenance
  • 15:51 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1172.eqiad.wmnet with reason: Maintenance
  • 15:51 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 15:51 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 15:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T354336)', diff saved to https://phabricator.wikimedia.org/P55214 and previous config saved to /var/cache/conftool/dbconfig/20240122-155115-marostegui.json
  • 15:41 marostegui@cumin1002: dbctl commit (dc=all): 'db1165 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55213 and previous config saved to /var/cache/conftool/dbconfig/20240122-154102-root.json
  • 15:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P55212 and previous config saved to /var/cache/conftool/dbconfig/20240122-153608-marostegui.json
  • 15:26 sukhe: sudo cumin -b1 -s120 "A:dns-rec and not P{dns6001*}" "enable-puppet 'do not enable' && run-puppet-agent"
  • 15:25 marostegui@cumin1002: dbctl commit (dc=all): 'db1165 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55211 and previous config saved to /var/cache/conftool/dbconfig/20240122-152557-root.json
  • 15:24 sukhe: re-enable puppet on A:dns-rec and run agent to finish merging CR 979159
  • 15:21 sukhe: enable puppet on dns6001 and run agent to test CR 979159
  • 15:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P55210 and previous config saved to /var/cache/conftool/dbconfig/20240122-152102-marostegui.json
  • 15:13 sukhe: disable Puppet on A:dns-rec to decouple anycast-hc and pdns-rec systemd binding: CR 979159
  • 15:10 marostegui@cumin1002: dbctl commit (dc=all): 'db1165 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55209 and previous config saved to /var/cache/conftool/dbconfig/20240122-151052-root.json
  • 15:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T354336)', diff saved to https://phabricator.wikimedia.org/P55208 and previous config saved to /var/cache/conftool/dbconfig/20240122-150555-marostegui.json
  • 15:00 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1167 (T354336)', diff saved to https://phabricator.wikimedia.org/P55207 and previous config saved to /var/cache/conftool/dbconfig/20240122-150046-marostegui.json
  • 15:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 15:00 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 15:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1167.eqiad.wmnet with reason: Maintenance
  • 15:00 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1167.eqiad.wmnet with reason: Maintenance
  • 14:55 marostegui@cumin1002: dbctl commit (dc=all): 'db1165 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55206 and previous config saved to /var/cache/conftool/dbconfig/20240122-145548-root.json
  • 14:55 hashar@deploy2002: Finished deploy [gerrit/gerrit@6257faa]: Update Zuul plugin for Gerrit 3.7 - T355521 (duration: 00m 07s)
  • 14:54 hashar@deploy2002: Started deploy [gerrit/gerrit@6257faa]: Update Zuul plugin for Gerrit 3.7 - T355521
  • 14:45 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2105.codfw.wmnet with reason: Maintenance
  • 14:45 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2105.codfw.wmnet with reason: Maintenance
  • 14:42 jgiannelos@deploy1002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
  • 14:41 Lucas_WMDE: UTC afternoon backport+config window done
  • 14:41 jgiannelos@deploy1002: helmfile [staging] START helmfile.d/services/mobileapps: apply
  • 14:41 jgiannelos@deploy1002: helmfile [staging] START helmfile.d/services/mobileapps: apply
  • 14:40 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for Set ShowRollbackConfirmation in arwiki (T355213) (duration: 09m 07s)
  • 14:40 marostegui@cumin1002: dbctl commit (dc=all): 'db1165 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55205 and previous config saved to /var/cache/conftool/dbconfig/20240122-144043-root.json
  • 14:40 jgiannelos@deploy1002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
  • 14:40 jgiannelos@deploy1002: helmfile [staging] START helmfile.d/services/mobileapps: apply
  • 14:35 logmsgbot: lucaswerkmeister-wmde@deploy2002 hubaishan and lucaswerkmeister-wmde: Continuing with sync
  • 14:33 logmsgbot: lucaswerkmeister-wmde@deploy2002 hubaishan and lucaswerkmeister-wmde: Backport for Set ShowRollbackConfirmation in arwiki (T355213) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:31 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for Set ShowRollbackConfirmation in arwiki (T355213)
  • 14:30 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for Restrict pagequality-validate right to patroller in arwikisource (T354503) (duration: 09m 41s)
  • 14:28 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1036.eqiad.wmnet to cluster eqiad and group B
  • 14:26 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1036.eqiad.wmnet to cluster eqiad and group B
  • 14:25 marostegui@cumin1002: dbctl commit (dc=all): 'db1165 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55204 and previous config saved to /var/cache/conftool/dbconfig/20240122-142538-root.json
  • 14:25 marostegui@cumin1002: dbctl commit (dc=all): 'Remove db1134', diff saved to https://phabricator.wikimedia.org/P55203 and previous config saved to /var/cache/conftool/dbconfig/20240122-142530-marostegui.json
  • 14:24 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and hubaishan: Continuing with sync
  • 14:21 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and hubaishan: Backport for Restrict pagequality-validate right to patroller in arwikisource (T354503) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:20 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for Restrict pagequality-validate right to patroller in arwikisource (T354503)
  • 13:57 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1165.eqiad.wmnet with OS bookworm
  • 13:36 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1165.eqiad.wmnet with reason: host reimage
  • 13:33 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1165.eqiad.wmnet with reason: host reimage
  • 13:24 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host ganeti1036.eqiad.wmnet
  • 13:22 marostegui: Upgrade sanitarium master, there will be lag on s6 wiki replicas T354506
  • 13:21 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1165.eqiad.wmnet with OS bookworm
  • 13:20 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1165', diff saved to https://phabricator.wikimedia.org/P55201 and previous config saved to /var/cache/conftool/dbconfig/20240122-132023-marostegui.json
  • 13:07 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2050.codfw.wmnet
  • 13:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1036.eqiad.wmnet
  • 13:05 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2049.codfw.wmnet
  • 13:05 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1049.eqiad.wmnet
  • 13:01 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2050.codfw.wmnet
  • 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1050.eqiad.wmnet
  • 12:59 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1049.eqiad.wmnet
  • 12:59 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2049.codfw.wmnet
  • 12:55 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1050.eqiad.wmnet
  • 12:48 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
  • 12:47 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
  • 12:33 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55200 and previous config saved to /var/cache/conftool/dbconfig/20240122-123351-root.json
  • 12:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190 (T354336)', diff saved to https://phabricator.wikimedia.org/P55199 and previous config saved to /var/cache/conftool/dbconfig/20240122-122634-marostegui.json
  • 12:18 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55198 and previous config saved to /var/cache/conftool/dbconfig/20240122-121846-root.json
  • 12:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P55197 and previous config saved to /var/cache/conftool/dbconfig/20240122-121128-marostegui.json
  • 12:06 volans@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 0:20:00 on sretest1001.eqiad.wmnet with reason: Testing
  • 12:03 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55195 and previous config saved to /var/cache/conftool/dbconfig/20240122-120341-root.json
  • 11:56 volans@cumin1002: START - Cookbook sre.hosts.downtime for 0:20:00 on sretest1001.eqiad.wmnet with reason: Testing
  • 11:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P55193 and previous config saved to /var/cache/conftool/dbconfig/20240122-115621-marostegui.json
  • 11:56 volans@cumin1002: END (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 0:20:00 on sretest1001.eqiad.wmnet with reason: Testing
  • 11:56 volans@cumin1002: START - Cookbook sre.hosts.downtime for 0:20:00 on sretest1001.eqiad.wmnet with reason: Testing
  • 11:48 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55192 and previous config saved to /var/cache/conftool/dbconfig/20240122-114836-root.json
  • 11:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190 (T354336)', diff saved to https://phabricator.wikimedia.org/P55191 and previous config saved to /var/cache/conftool/dbconfig/20240122-114115-marostegui.json
  • 11:41 vgutierrez: update to HAProxy 2.8.5 on cp3066 - T354424
  • 11:33 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55190 and previous config saved to /var/cache/conftool/dbconfig/20240122-113331-root.json
  • 11:26 jelto: start envoy on ticket-test.wikimedia.org to test alerting - T354479
  • 11:24 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2190 (T354336)', diff saved to https://phabricator.wikimedia.org/P55189 and previous config saved to /var/cache/conftool/dbconfig/20240122-112401-marostegui.json
  • 11:23 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2190.codfw.wmnet with reason: Maintenance
  • 11:23 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2190.codfw.wmnet with reason: Maintenance
  • 11:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T354336)', diff saved to https://phabricator.wikimedia.org/P55188 and previous config saved to /var/cache/conftool/dbconfig/20240122-112339-marostegui.json
  • 11:21 jelto: stop envoy on ticket-test.wikimedia.org to test alerting - T354479
  • 11:18 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55187 and previous config saved to /var/cache/conftool/dbconfig/20240122-111826-root.json
  • 11:10 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2048.codfw.wmnet
  • 11:10 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1048.eqiad.wmnet
  • 11:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P55185 and previous config saved to /var/cache/conftool/dbconfig/20240122-110833-marostegui.json
  • 11:04 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2048.codfw.wmnet
  • 11:04 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1048.eqiad.wmnet
  • 11:03 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55184 and previous config saved to /var/cache/conftool/dbconfig/20240122-110321-root.json
  • 11:01 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2158.codfw.wmnet with OS bookworm
  • 10:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P55183 and previous config saved to /var/cache/conftool/dbconfig/20240122-105326-marostegui.json
  • 10:52 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3315 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55182 and previous config saved to /var/cache/conftool/dbconfig/20240122-105237-root.json
  • 10:52 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3316 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55181 and previous config saved to /var/cache/conftool/dbconfig/20240122-105222-root.json
  • 10:40 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2158.codfw.wmnet with reason: host reimage
  • 10:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T354336)', diff saved to https://phabricator.wikimedia.org/P55180 and previous config saved to /var/cache/conftool/dbconfig/20240122-103820-marostegui.json
  • 10:37 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3315 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55179 and previous config saved to /var/cache/conftool/dbconfig/20240122-103732-root.json
  • 10:37 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2158.codfw.wmnet with reason: host reimage
  • 10:37 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3316 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55178 and previous config saved to /var/cache/conftool/dbconfig/20240122-103717-root.json
  • 10:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2147 (T352010)', diff saved to https://phabricator.wikimedia.org/P55177 and previous config saved to /var/cache/conftool/dbconfig/20240122-103520-ladsgroup.json
  • 10:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2147.codfw.wmnet with reason: Maintenance
  • 10:35 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2147.codfw.wmnet with reason: Maintenance
  • 10:22 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3315 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55176 and previous config saved to /var/cache/conftool/dbconfig/20240122-102227-root.json
  • 10:22 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2177 (T354336)', diff saved to https://phabricator.wikimedia.org/P55175 and previous config saved to /var/cache/conftool/dbconfig/20240122-102220-marostegui.json
  • 10:22 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2177.codfw.wmnet with reason: Maintenance
  • 10:22 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3316 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55174 and previous config saved to /var/cache/conftool/dbconfig/20240122-102212-root.json
  • 10:22 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2177.codfw.wmnet with reason: Maintenance
  • 10:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T354336)', diff saved to https://phabricator.wikimedia.org/P55173 and previous config saved to /var/cache/conftool/dbconfig/20240122-102158-marostegui.json
  • 10:18 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2158.codfw.wmnet with OS bookworm
  • 10:16 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2158', diff saved to https://phabricator.wikimedia.org/P55172 and previous config saved to /var/cache/conftool/dbconfig/20240122-101634-marostegui.json
  • 10:13 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for gerrit[1003,2002].wikimedia.org
  • 10:13 cgoubert@cumin1002: START - Cookbook sre.hosts.remove-downtime for gerrit[1003,2002].wikimedia.org
  • 10:07 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3315 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55171 and previous config saved to /var/cache/conftool/dbconfig/20240122-100722-root.json
  • 10:07 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3316 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55170 and previous config saved to /var/cache/conftool/dbconfig/20240122-100707-root.json
  • 10:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P55169 and previous config saved to /var/cache/conftool/dbconfig/20240122-100651-marostegui.json
  • 10:04 hashar: gerrit: running jgit gc on every repository to regenerate potentially faulty reachability bitmaps files preventing fetches on some repositories # T355173
  • 10:00 jelto: start envoy on ticket-test.wikimedia.org to test alerting - T354479
  • 09:57 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2049.codfw.wmnet
  • 09:56 jelto: stop envoy on ticket-test.wikimedia.org to test alerting - T354479
  • 09:52 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2049.codfw.wmnet
  • 09:52 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1049.eqiad.wmnet
  • 09:52 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3315 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55167 and previous config saved to /var/cache/conftool/dbconfig/20240122-095217-root.json
  • 09:52 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3316 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55166 and previous config saved to /var/cache/conftool/dbconfig/20240122-095202-root.json
  • 09:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P55165 and previous config saved to /var/cache/conftool/dbconfig/20240122-095145-marostegui.json
  • 09:51 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1035.eqiad.wmnet to cluster eqiad and group A
  • 09:49 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1035.eqiad.wmnet to cluster eqiad and group A
  • 09:47 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1049.eqiad.wmnet
  • 09:38 hashar: Restarted Gerrit with upgraded version 3.7.6 # T354885
  • 09:37 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3315 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55164 and previous config saved to /var/cache/conftool/dbconfig/20240122-093712-root.json
  • 09:36 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3316 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55163 and previous config saved to /var/cache/conftool/dbconfig/20240122-093657-root.json
  • 09:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T354336)', diff saved to https://phabricator.wikimedia.org/P55162 and previous config saved to /var/cache/conftool/dbconfig/20240122-093638-marostegui.json
  • 09:26 cgoubert@cumin1002: conftool action : set/pooled=no; selector: name=mw2394.codfw.wmnet
  • 09:26 cgoubert@cumin1002: conftool action : set/pooled=yes; selector: name=mw2444.codfw.wmnet
  • 09:22 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3315 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55161 and previous config saved to /var/cache/conftool/dbconfig/20240122-092207-root.json
  • 09:21 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3316 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55160 and previous config saved to /var/cache/conftool/dbconfig/20240122-092152-root.json
  • 09:19 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2156 (T354336)', diff saved to https://phabricator.wikimedia.org/P55159 and previous config saved to /var/cache/conftool/dbconfig/20240122-091916-marostegui.json
  • 09:19 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 09:18 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 09:18 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2156.codfw.wmnet with reason: Maintenance
  • 09:18 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1035.eqiad.wmnet to cluster eqiad and group A
  • 09:18 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1035.eqiad.wmnet to cluster eqiad and group A
  • 09:18 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2156.codfw.wmnet with reason: Maintenance
  • 09:18 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T354336)', diff saved to https://phabricator.wikimedia.org/P55158 and previous config saved to /var/cache/conftool/dbconfig/20240122-091838-marostegui.json
  • 09:17 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1213.eqiad.wmnet with OS bookworm
  • 09:17 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on gerrit[1003,2002].wikimedia.org with reason: Gerrit update
  • 09:17 cgoubert@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on gerrit[1003,2002].wikimedia.org with reason: Gerrit update
  • 09:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1035.eqiad.wmnet
  • 09:11 hashar: Gerrit: reindexing all changes for 3.6 > 3.7 migration # T354885
  • 09:08 hashar@deploy2002: Finished deploy [gerrit/gerrit@bdd1a8b]: Gerrit to version 3.7.6 (duration: 00m 10s)
  • 09:08 hashar@deploy2002: Started deploy [gerrit/gerrit@bdd1a8b]: Gerrit to version 3.7.6
  • 09:06 hashar: Upgrading Gerrit # T354885
  • 09:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1035.eqiad.wmnet
  • 09:05 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3317 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55157 and previous config saved to /var/cache/conftool/dbconfig/20240122-090504-root.json
  • 09:03 cgoubert@cumin1002: conftool action : set/pooled=no; selector: name=mw2444.codfw.wmnet
  • 09:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P55156 and previous config saved to /var/cache/conftool/dbconfig/20240122-090332-marostegui.json
  • 09:02 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3316 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55155 and previous config saved to /var/cache/conftool/dbconfig/20240122-090218-root.json
  • 09:01 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for mw2394.codfw.wmnet
  • 09:01 cgoubert@cumin1002: START - Cookbook sre.hosts.remove-downtime for mw2394.codfw.wmnet
  • 08:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1213.eqiad.wmnet with reason: host reimage
  • 08:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1213.eqiad.wmnet with reason: host reimage
  • 08:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3317 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55154 and previous config saved to /var/cache/conftool/dbconfig/20240122-084959-root.json
  • 08:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P55153 and previous config saved to /var/cache/conftool/dbconfig/20240122-084825-marostegui.json
  • 08:47 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3316 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55152 and previous config saved to /var/cache/conftool/dbconfig/20240122-084713-root.json
  • 08:43 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2048.codfw.wmnet
  • 08:39 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1213.eqiad.wmnet with OS bookworm
  • 08:38 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1213:3316 db1213:3315', diff saved to https://phabricator.wikimedia.org/P55151 and previous config saved to /var/cache/conftool/dbconfig/20240122-083812-marostegui.json
  • 08:38 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2048.codfw.wmnet
  • 08:37 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1048.eqiad.wmnet
  • 08:35 xSavitar: UTC morning backport window done!
  • 08:34 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3317 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55150 and previous config saved to /var/cache/conftool/dbconfig/20240122-083454-root.json
  • 08:34 derick@deploy2002: Finished scap: Backport for wmf-config: Remove unused wgCentralAuthTokenCacheType (T336004) (duration: 18m 15s)
  • 08:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T354336)', diff saved to https://phabricator.wikimedia.org/P55149 and previous config saved to /var/cache/conftool/dbconfig/20240122-083319-marostegui.json
  • 08:32 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1048.eqiad.wmnet
  • 08:32 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3316 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55148 and previous config saved to /var/cache/conftool/dbconfig/20240122-083208-root.json
  • 08:27 derick@deploy2002: d3r1ck01 and derick: Continuing with sync
  • 08:26 derick@deploy2002: d3r1ck01 and derick: Backport for wmf-config: Remove unused wgCentralAuthTokenCacheType (T336004) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:19 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3317 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55147 and previous config saved to /var/cache/conftool/dbconfig/20240122-081950-root.json
  • 08:17 marostegui@cumin1002: dbctl commit (dc=all): 'db1187 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55146 and previous config saved to /var/cache/conftool/dbconfig/20240122-081727-root.json
  • 08:17 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3316 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55145 and previous config saved to /var/cache/conftool/dbconfig/20240122-081703-root.json
  • 08:16 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2149 (T354336)', diff saved to https://phabricator.wikimedia.org/P55144 and previous config saved to /var/cache/conftool/dbconfig/20240122-081618-marostegui.json
  • 08:16 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2149.codfw.wmnet with reason: Maintenance
  • 08:15 derick@deploy2002: Started scap: Backport for wmf-config: Remove unused wgCentralAuthTokenCacheType (T336004)
  • 08:15 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2149.codfw.wmnet with reason: Maintenance
  • 08:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2127 (T354336)', diff saved to https://phabricator.wikimedia.org/P55143 and previous config saved to /var/cache/conftool/dbconfig/20240122-081545-marostegui.json
  • 08:04 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3317 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55142 and previous config saved to /var/cache/conftool/dbconfig/20240122-080445-root.json
  • 08:02 marostegui@cumin1002: dbctl commit (dc=all): 'db1187 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55141 and previous config saved to /var/cache/conftool/dbconfig/20240122-080222-root.json
  • 08:01 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3316 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55140 and previous config saved to /var/cache/conftool/dbconfig/20240122-080158-root.json
  • 08:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2127', diff saved to https://phabricator.wikimedia.org/P55139 and previous config saved to /var/cache/conftool/dbconfig/20240122-080038-marostegui.json
  • 07:54 root@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Shubhankar Patankar out of all services on: 2208 hosts
  • 07:53 root@cumin2002: START - Cookbook sre.idm.logout Logging Shubhankar Patankar out of all services on: 2208 hosts
  • 07:49 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3317 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55138 and previous config saved to /var/cache/conftool/dbconfig/20240122-074940-root.json
  • 07:47 marostegui@cumin1002: dbctl commit (dc=all): 'db1187 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55137 and previous config saved to /var/cache/conftool/dbconfig/20240122-074717-root.json
  • 07:46 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3316 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55136 and previous config saved to /var/cache/conftool/dbconfig/20240122-074653-root.json
  • 07:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2127', diff saved to https://phabricator.wikimedia.org/P55135 and previous config saved to /var/cache/conftool/dbconfig/20240122-074532-marostegui.json
  • 07:36 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2169.codfw.wmnet with OS bookworm
  • 07:34 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3317 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55134 and previous config saved to /var/cache/conftool/dbconfig/20240122-073435-root.json
  • 07:32 marostegui@cumin1002: dbctl commit (dc=all): 'db1187 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55133 and previous config saved to /var/cache/conftool/dbconfig/20240122-073212-root.json
  • 07:31 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3316 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55132 and previous config saved to /var/cache/conftool/dbconfig/20240122-073148-root.json
  • 07:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2127 (T354336)', diff saved to https://phabricator.wikimedia.org/P55131 and previous config saved to /var/cache/conftool/dbconfig/20240122-073025-marostegui.json
  • 07:28 kart_: Updated MinT to 2024-01-22-053144-production (T355303, T338608, T353510, T354666)
  • 07:20 kartik@deploy2002: helmfile [eqiad] DONE helmfile.d/services/machinetranslation: apply
  • 07:17 marostegui@cumin1002: dbctl commit (dc=all): 'db1187 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55130 and previous config saved to /var/cache/conftool/dbconfig/20240122-071707-root.json
  • 07:14 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2169.codfw.wmnet with reason: host reimage
  • 07:13 kartik@deploy2002: helmfile [eqiad] START helmfile.d/services/machinetranslation: apply
  • 07:12 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2127 (T354336)', diff saved to https://phabricator.wikimedia.org/P55129 and previous config saved to /var/cache/conftool/dbconfig/20240122-071117-marostegui.json
  • 07:11 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2169.codfw.wmnet with reason: host reimage
  • 07:11 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2127.codfw.wmnet with reason: Maintenance
  • 07:11 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2127.codfw.wmnet with reason: Maintenance
  • 07:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2109 (T354336)', diff saved to https://phabricator.wikimedia.org/P55128 and previous config saved to /var/cache/conftool/dbconfig/20240122-071054-marostegui.json
  • 07:02 marostegui@cumin1002: dbctl commit (dc=all): 'db1187 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55127 and previous config saved to /var/cache/conftool/dbconfig/20240122-070202-root.json
  • 07:02 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/machinetranslation: apply
  • 06:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P55126 and previous config saved to /var/cache/conftool/dbconfig/20240122-065548-marostegui.json
  • 06:55 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/machinetranslation: apply
  • 06:52 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2169.codfw.wmnet with OS bookworm
  • 06:52 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/machinetranslation: apply
  • 06:49 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2169:3316 db2169:3317', diff saved to https://phabricator.wikimedia.org/P55125 and previous config saved to /var/cache/conftool/dbconfig/20240122-064929-marostegui.json
  • 06:47 kartik@deploy2002: helmfile [staging] START helmfile.d/services/machinetranslation: apply
  • 06:46 marostegui@cumin1002: dbctl commit (dc=all): 'db1187 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55124 and previous config saved to /var/cache/conftool/dbconfig/20240122-064657-root.json
  • 06:46 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1187.eqiad.wmnet with OS bookworm
  • 06:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P55123 and previous config saved to /var/cache/conftool/dbconfig/20240122-064041-marostegui.json
  • 06:26 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1187.eqiad.wmnet with reason: host reimage
  • 06:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2109 (T354336)', diff saved to https://phabricator.wikimedia.org/P55122 and previous config saved to /var/cache/conftool/dbconfig/20240122-062535-marostegui.json
  • 06:23 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1187.eqiad.wmnet with reason: host reimage
  • 06:10 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1187.eqiad.wmnet with OS bookworm
  • 06:08 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1187 T354506', diff saved to https://phabricator.wikimedia.org/P55121 and previous config saved to /var/cache/conftool/dbconfig/20240122-060811-marostegui.json
  • 06:05 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2109 (T354336)', diff saved to https://phabricator.wikimedia.org/P55120 and previous config saved to /var/cache/conftool/dbconfig/20240122-060529-marostegui.json
  • 06:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2109.codfw.wmnet with reason: Maintenance
  • 06:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2109.codfw.wmnet with reason: Maintenance
  • 06:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1157.eqiad.wmnet with reason: Maintenance
  • 06:00 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1157.eqiad.wmnet with reason: Maintenance
  • 05:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 05:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 05:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P55119 and previous config saved to /var/cache/conftool/dbconfig/20240122-054005-ladsgroup.json
  • 05:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314', diff saved to https://phabricator.wikimedia.org/P55118 and previous config saved to /var/cache/conftool/dbconfig/20240122-052458-ladsgroup.json
  • 05:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314', diff saved to https://phabricator.wikimedia.org/P55117 and previous config saved to /var/cache/conftool/dbconfig/20240122-050952-ladsgroup.json
  • 04:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P55116 and previous config saved to /var/cache/conftool/dbconfig/20240122-045445-ladsgroup.json

2024-01-21

  • 23:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2138:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P55115 and previous config saved to /var/cache/conftool/dbconfig/20240121-232323-ladsgroup.json
  • 23:23 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2138.codfw.wmnet with reason: Maintenance
  • 23:23 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2138.codfw.wmnet with reason: Maintenance
  • 23:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P55114 and previous config saved to /var/cache/conftool/dbconfig/20240121-232300-ladsgroup.json
  • 23:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314', diff saved to https://phabricator.wikimedia.org/P55113 and previous config saved to /var/cache/conftool/dbconfig/20240121-230754-ladsgroup.json
  • 22:55 tgr: T355491 Ran mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=dawiki --logwiki=metawiki 'Radiocolono' 'GuaritaRM'
  • 22:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314', diff saved to https://phabricator.wikimedia.org/P55112 and previous config saved to /var/cache/conftool/dbconfig/20240121-225247-ladsgroup.json
  • 22:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P55111 and previous config saved to /var/cache/conftool/dbconfig/20240121-223740-ladsgroup.json
  • 17:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2137:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P55110 and previous config saved to /var/cache/conftool/dbconfig/20240121-171534-ladsgroup.json
  • 17:15 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2137.codfw.wmnet with reason: Maintenance
  • 17:15 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2137.codfw.wmnet with reason: Maintenance
  • 17:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136 (T352010)', diff saved to https://phabricator.wikimedia.org/P55109 and previous config saved to /var/cache/conftool/dbconfig/20240121-171512-ladsgroup.json
  • 17:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136', diff saved to https://phabricator.wikimedia.org/P55108 and previous config saved to /var/cache/conftool/dbconfig/20240121-170005-ladsgroup.json
  • 16:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136', diff saved to https://phabricator.wikimedia.org/P55107 and previous config saved to /var/cache/conftool/dbconfig/20240121-164459-ladsgroup.json
  • 16:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136 (T352010)', diff saved to https://phabricator.wikimedia.org/P55106 and previous config saved to /var/cache/conftool/dbconfig/20240121-162952-ladsgroup.json
  • 11:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2136 (T352010)', diff saved to https://phabricator.wikimedia.org/P55105 and previous config saved to /var/cache/conftool/dbconfig/20240121-110344-ladsgroup.json
  • 11:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2136.codfw.wmnet with reason: Maintenance
  • 11:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2136.codfw.wmnet with reason: Maintenance
  • 11:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119 (T352010)', diff saved to https://phabricator.wikimedia.org/P55104 and previous config saved to /var/cache/conftool/dbconfig/20240121-110322-ladsgroup.json
  • 10:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119', diff saved to https://phabricator.wikimedia.org/P55103 and previous config saved to /var/cache/conftool/dbconfig/20240121-104815-ladsgroup.json
  • 10:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119', diff saved to https://phabricator.wikimedia.org/P55102 and previous config saved to /var/cache/conftool/dbconfig/20240121-103309-ladsgroup.json
  • 10:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119 (T352010)', diff saved to https://phabricator.wikimedia.org/P55101 and previous config saved to /var/cache/conftool/dbconfig/20240121-101802-ladsgroup.json
  • 09:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2119 (T352010)', diff saved to https://phabricator.wikimedia.org/P55100 and previous config saved to /var/cache/conftool/dbconfig/20240121-091731-ladsgroup.json
  • 09:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2119.codfw.wmnet with reason: Maintenance
  • 09:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2119.codfw.wmnet with reason: Maintenance
  • 09:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2110 (T352010)', diff saved to https://phabricator.wikimedia.org/P55099 and previous config saved to /var/cache/conftool/dbconfig/20240121-091708-ladsgroup.json
  • 09:08 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2175', diff saved to https://phabricator.wikimedia.org/P55098 and previous config saved to /var/cache/conftool/dbconfig/20240121-090831-marostegui.json
  • 09:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2110', diff saved to https://phabricator.wikimedia.org/P55097 and previous config saved to /var/cache/conftool/dbconfig/20240121-090202-ladsgroup.json
  • 08:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2110', diff saved to https://phabricator.wikimedia.org/P55096 and previous config saved to /var/cache/conftool/dbconfig/20240121-084655-ladsgroup.json
  • 08:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2110 (T352010)', diff saved to https://phabricator.wikimedia.org/P55095 and previous config saved to /var/cache/conftool/dbconfig/20240121-083148-ladsgroup.json
  • 02:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2110 (T352010)', diff saved to https://phabricator.wikimedia.org/P55094 and previous config saved to /var/cache/conftool/dbconfig/20240121-024507-ladsgroup.json
  • 02:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2110.codfw.wmnet with reason: Maintenance
  • 02:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2110.codfw.wmnet with reason: Maintenance
  • 02:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106 (T352010)', diff saved to https://phabricator.wikimedia.org/P55093 and previous config saved to /var/cache/conftool/dbconfig/20240121-024445-ladsgroup.json
  • 02:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106', diff saved to https://phabricator.wikimedia.org/P55092 and previous config saved to /var/cache/conftool/dbconfig/20240121-022939-ladsgroup.json
  • 02:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106', diff saved to https://phabricator.wikimedia.org/P55091 and previous config saved to /var/cache/conftool/dbconfig/20240121-021432-ladsgroup.json
  • 01:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106 (T352010)', diff saved to https://phabricator.wikimedia.org/P55090 and previous config saved to /var/cache/conftool/dbconfig/20240121-015926-ladsgroup.json
  • 00:29 mutante: phabricator is back and on bullseye
  • 00:11 brennen@deploy2002: Finished deploy [phabricator/deployment@24a2a2a]: initial deploy to re-imaged phab1004 (duration: 00m 13s)
  • 00:11 brennen@deploy2002: Started deploy [phabricator/deployment@24a2a2a]: initial deploy to re-imaged phab1004
  • 00:03 mutante: phab1004:/usr/bin# ln -s /var/lib/scap/scap/bin/scap .
  • 00:00 brennen@deploy2002: Installation of scap version "latest" completed for 1 hosts
  • 00:00 brennen@deploy2002: Installing scap version "latest" for 1 hosts

2024-01-20

  • 23:58 mutante: phab1004 - chown -R scap:scap /var/lib/scap
  • 23:10 brennen@deploy2002: Installing scap version "latest" for 1 hosts
  • 22:45 brennen@deploy2002: Finished deploy [phabricator/deployment@24a2a2a]: initial deploy to re-imaged phab1004 (duration: 00m 10s)
  • 22:44 brennen@deploy2002: Started deploy [phabricator/deployment@24a2a2a]: initial deploy to re-imaged phab1004
  • 22:39 brennen@deploy2002: Finished deploy [phabricator/deployment@24a2a2a]: initial deploy to re-imaged phab1004 (duration: 00m 10s)
  • 22:39 brennen@deploy2002: Started deploy [phabricator/deployment@24a2a2a]: initial deploy to re-imaged phab1004
  • 22:34 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: deployment
  • 22:34 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on phab2002.codfw.wmnet with reason: deployment
  • 22:28 brennen@deploy2002: Finished deploy [phabricator/deployment@24a2a2a]: deploy to phab2002 to pick up db config revert (part 2) (duration: 00m 54s)
  • 22:27 brennen@deploy2002: Started deploy [phabricator/deployment@24a2a2a]: deploy to phab2002 to pick up db config revert (part 2)
  • 22:23 brennen@deploy2002: Finished deploy [phabricator/deployment@24a2a2a]: deploy to phab2002 to pick up db config revert (duration: 00m 55s)
  • 22:22 brennen@deploy2002: Started deploy [phabricator/deployment@24a2a2a]: deploy to phab2002 to pick up db config revert
  • 22:02 dzahn@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on phabricator.wikimedia.org with reason: OS upgrade
  • 22:02 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on phabricator.wikimedia.org with reason: OS upgrade
  • 22:02 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on phab.wmfusercontent.org with reason: OS upgrade
  • 22:02 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on phab.wmfusercontent.org with reason: OS upgrade
  • 22:02 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host phab1004.eqiad.wmnet with OS bullseye
  • 22:02 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on phab1004.eqiad.wmnet with reason: OS upgrade
  • 22:01 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on phab1004.eqiad.wmnet with reason: OS upgrade
  • 21:46 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on phab1004.eqiad.wmnet with reason: host reimage
  • 21:43 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on phab1004.eqiad.wmnet with reason: host reimage
  • 21:33 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on phab2002.codfw.wmnet with reason: deployment
  • 21:33 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on phab2002.codfw.wmnet with reason: deployment
  • 21:31 dzahn@cumin1002: START - Cookbook sre.hosts.reimage for host phab1004.eqiad.wmnet with OS bullseye
  • 21:27 dzahn@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host phab1004.eqiad.wmnet with OS bullseye
  • 21:27 dzahn@cumin1002: START - Cookbook sre.hosts.reimage for host phab1004.eqiad.wmnet with OS bullseye
  • 21:03 brennen@deploy2002: Finished deploy [phabricator/deployment@24a2a2a]: deploy to phab2002 to pick up db config changes (redux) (duration: 01m 35s)
  • 21:02 brennen@deploy2002: Started deploy [phabricator/deployment@24a2a2a]: deploy to phab2002 to pick up db config changes (redux)
  • 20:38 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on phab2002.codfw.wmnet with reason: maintenance
  • 20:38 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on phab2002.codfw.wmnet with reason: maintenance
  • 20:37 brennen@deploy2002: Finished deploy [phabricator/deployment@24a2a2a]: deploy to phab2002 to pick up database changes (duration: 00m 53s)
  • 20:36 brennen@deploy2002: Started deploy [phabricator/deployment@24a2a2a]: deploy to phab2002 to pick up database changes
  • 20:32 mutante: phabricator going down for maintenance
  • 20:24 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on phab.wmfusercontent.org with reason: OS upgrade
  • 20:23 dzahn@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on phabricator.wikimedia.org with reason: OS upgrade
  • 20:23 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on phabricator.wikimedia.org with reason: OS upgrade
  • 20:22 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on phab1004.eqiad.wmnet with reason: OS upgrade
  • 20:22 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on phab1004.eqiad.wmnet with reason: OS upgrade
  • 20:04 brennen: start of phab/phorge bullseye update window - T334519
  • 20:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2106 (T352010)', diff saved to https://phabricator.wikimedia.org/P55089 and previous config saved to /var/cache/conftool/dbconfig/20240120-200154-ladsgroup.json
  • 20:01 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2106.codfw.wmnet with reason: Maintenance
  • 20:01 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2106.codfw.wmnet with reason: Maintenance
  • 14:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2099.codfw.wmnet with reason: Maintenance
  • 14:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2099.codfw.wmnet with reason: Maintenance
  • 09:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 09:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 09:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1249 (T352010)', diff saved to https://phabricator.wikimedia.org/P55087 and previous config saved to /var/cache/conftool/dbconfig/20240120-095311-ladsgroup.json
  • 09:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P55086 and previous config saved to /var/cache/conftool/dbconfig/20240120-093804-ladsgroup.json
  • 09:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P55085 and previous config saved to /var/cache/conftool/dbconfig/20240120-092257-ladsgroup.json
  • 09:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1249 (T352010)', diff saved to https://phabricator.wikimedia.org/P55084 and previous config saved to /var/cache/conftool/dbconfig/20240120-090751-ladsgroup.json
  • 04:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1249 (T352010)', diff saved to https://phabricator.wikimedia.org/P55083 and previous config saved to /var/cache/conftool/dbconfig/20240120-041124-ladsgroup.json
  • 04:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1249.eqiad.wmnet with reason: Maintenance
  • 04:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1249.eqiad.wmnet with reason: Maintenance
  • 04:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1248 (T352010)', diff saved to https://phabricator.wikimedia.org/P55082 and previous config saved to /var/cache/conftool/dbconfig/20240120-041102-ladsgroup.json
  • 03:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P55081 and previous config saved to /var/cache/conftool/dbconfig/20240120-035555-ladsgroup.json
  • 03:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P55080 and previous config saved to /var/cache/conftool/dbconfig/20240120-034049-ladsgroup.json
  • 03:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1248 (T352010)', diff saved to https://phabricator.wikimedia.org/P55079 and previous config saved to /var/cache/conftool/dbconfig/20240120-032542-ladsgroup.json

2024-01-19

  • 22:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1248 (T352010)', diff saved to https://phabricator.wikimedia.org/P55078 and previous config saved to /var/cache/conftool/dbconfig/20240119-225906-ladsgroup.json
  • 22:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1248.eqiad.wmnet with reason: Maintenance
  • 22:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1248.eqiad.wmnet with reason: Maintenance
  • 22:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1247 (T352010)', diff saved to https://phabricator.wikimedia.org/P55077 and previous config saved to /var/cache/conftool/dbconfig/20240119-225844-ladsgroup.json
  • 22:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P55076 and previous config saved to /var/cache/conftool/dbconfig/20240119-224337-ladsgroup.json
  • 22:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P55075 and previous config saved to /var/cache/conftool/dbconfig/20240119-222830-ladsgroup.json
  • 22:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1247 (T352010)', diff saved to https://phabricator.wikimedia.org/P55074 and previous config saved to /var/cache/conftool/dbconfig/20240119-221324-ladsgroup.json
  • 22:05 ryankemper: [WDQS] Repooled `wdqs10[19,20]` (caught up on lag)
  • 20:21 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 20:21 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 20:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1223 (T354336)', diff saved to https://phabricator.wikimedia.org/P55073 and previous config saved to /var/cache/conftool/dbconfig/20240119-202129-marostegui.json
  • 20:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1223', diff saved to https://phabricator.wikimedia.org/P55072 and previous config saved to /var/cache/conftool/dbconfig/20240119-200622-marostegui.json
  • 19:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1223', diff saved to https://phabricator.wikimedia.org/P55071 and previous config saved to /var/cache/conftool/dbconfig/20240119-195116-marostegui.json
  • 19:45 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2088.codfw.wmnet with OS bullseye
  • 19:43 bking@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['elastic2088.codfw.wmnet']
  • 19:38 bking@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['elastic2088.codfw.wmnet']
  • 19:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1223 (T354336)', diff saved to https://phabricator.wikimedia.org/P55070 and previous config saved to /var/cache/conftool/dbconfig/20240119-193610-marostegui.json
  • 19:31 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1223 (T354336)', diff saved to https://phabricator.wikimedia.org/P55069 and previous config saved to /var/cache/conftool/dbconfig/20240119-193028-marostegui.json
  • 19:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1223.eqiad.wmnet with reason: Maintenance
  • 19:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1223.eqiad.wmnet with reason: Maintenance
  • 19:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T354336)', diff saved to https://phabricator.wikimedia.org/P55068 and previous config saved to /var/cache/conftool/dbconfig/20240119-193006-marostegui.json
  • 19:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P55067 and previous config saved to /var/cache/conftool/dbconfig/20240119-191459-marostegui.json
  • 18:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P55066 and previous config saved to /var/cache/conftool/dbconfig/20240119-185953-marostegui.json
  • 18:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T354336)', diff saved to https://phabricator.wikimedia.org/P55065 and previous config saved to /var/cache/conftool/dbconfig/20240119-184446-marostegui.json
  • 18:39 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1212 (T354336)', diff saved to https://phabricator.wikimedia.org/P55064 and previous config saved to /var/cache/conftool/dbconfig/20240119-183902-marostegui.json
  • 18:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 18:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 18:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1212.eqiad.wmnet with reason: Maintenance
  • 18:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1212.eqiad.wmnet with reason: Maintenance
  • 18:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T354336)', diff saved to https://phabricator.wikimedia.org/P55063 and previous config saved to /var/cache/conftool/dbconfig/20240119-183821-marostegui.json
  • 18:27 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2088.codfw.wmnet with OS bullseye
  • 18:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P55062 and previous config saved to /var/cache/conftool/dbconfig/20240119-182314-marostegui.json
  • 18:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P55061 and previous config saved to /var/cache/conftool/dbconfig/20240119-180808-marostegui.json
  • 18:02 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2094.codfw.wmnet with OS bullseye
  • 17:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T354336)', diff saved to https://phabricator.wikimedia.org/P55060 and previous config saved to /var/cache/conftool/dbconfig/20240119-175301-marostegui.json
  • 17:47 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1198 (T354336)', diff saved to https://phabricator.wikimedia.org/P55059 and previous config saved to /var/cache/conftool/dbconfig/20240119-174735-marostegui.json
  • 17:47 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1198.eqiad.wmnet with reason: Maintenance
  • 17:47 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1198.eqiad.wmnet with reason: Maintenance
  • 17:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T354336)', diff saved to https://phabricator.wikimedia.org/P55058 and previous config saved to /var/cache/conftool/dbconfig/20240119-174713-marostegui.json
  • 17:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P55057 and previous config saved to /var/cache/conftool/dbconfig/20240119-173207-marostegui.json
  • 17:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1247 (T352010)', diff saved to https://phabricator.wikimedia.org/P55056 and previous config saved to /var/cache/conftool/dbconfig/20240119-172715-ladsgroup.json
  • 17:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1247.eqiad.wmnet with reason: Maintenance
  • 17:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1247.eqiad.wmnet with reason: Maintenance
  • 17:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1243 (T352010)', diff saved to https://phabricator.wikimedia.org/P55055 and previous config saved to /var/cache/conftool/dbconfig/20240119-172652-ladsgroup.json
  • 17:25 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on cloudelastic1010.wikimedia.org with reason: need to fix regex certs
  • 17:25 bking@cumin2002: START - Cookbook sre.hosts.downtime for 4:00:00 on cloudelastic1010.wikimedia.org with reason: need to fix regex certs
  • 17:23 bking@cumin2002: conftool action : set/pooled=yes; selector: name=cloudelastic1010.wikimedia.org
  • 17:23 bking@cumin2002: conftool action : set/pooled=yes; selector: name=cloudelastic1009.wikimedia.org
  • 17:23 bking@cumin2002: conftool action : set/pooled=yes; selector: name=cloudelastic1008.wikimedia.org
  • 17:22 bking@cumin2002: conftool action : set/pooled=yes; selector: name=cloudelastic1007.wikimedia.org
  • 17:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P55054 and previous config saved to /var/cache/conftool/dbconfig/20240119-171700-marostegui.json
  • 17:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P55053 and previous config saved to /var/cache/conftool/dbconfig/20240119-171146-ladsgroup.json
  • 17:06 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2088.codfw.wmnet with OS bullseye
  • 17:04 bking@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host elastic2088.codfw.wmnet with OS bullseye
  • 17:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T354336)', diff saved to https://phabricator.wikimedia.org/P55052 and previous config saved to /var/cache/conftool/dbconfig/20240119-170154-marostegui.json
  • 16:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P55051 and previous config saved to /var/cache/conftool/dbconfig/20240119-165639-ladsgroup.json
  • 16:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1189 (T354336)', diff saved to https://phabricator.wikimedia.org/P55050 and previous config saved to /var/cache/conftool/dbconfig/20240119-165627-marostegui.json
  • 16:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1189.eqiad.wmnet with reason: Maintenance
  • 16:56 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1189.eqiad.wmnet with reason: Maintenance
  • 16:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T354336)', diff saved to https://phabricator.wikimedia.org/P55049 and previous config saved to /var/cache/conftool/dbconfig/20240119-165605-marostegui.json
  • 16:41 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2094.codfw.wmnet with OS bullseye
  • 16:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1243 (T352010)', diff saved to https://phabricator.wikimedia.org/P55048 and previous config saved to /var/cache/conftool/dbconfig/20240119-164133-ladsgroup.json
  • 16:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P55047 and previous config saved to /var/cache/conftool/dbconfig/20240119-164058-marostegui.json
  • 16:38 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2088.codfw.wmnet with OS bullseye
  • 16:31 Emperor: mark new drive as non-RAID, mount, restore to service with puppet ms-be2072 T355330
  • 16:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P55046 and previous config saved to /var/cache/conftool/dbconfig/20240119-162552-marostegui.json
  • 16:16 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2094.codfw.wmnet with OS bullseye
  • 16:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T354336)', diff saved to https://phabricator.wikimedia.org/P55045 and previous config saved to /var/cache/conftool/dbconfig/20240119-161046-marostegui.json
  • 16:05 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1175 (T354336)', diff saved to https://phabricator.wikimedia.org/P55044 and previous config saved to /var/cache/conftool/dbconfig/20240119-160521-marostegui.json
  • 16:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1175.eqiad.wmnet with reason: Maintenance
  • 16:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1175.eqiad.wmnet with reason: Maintenance
  • 16:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T354336)', diff saved to https://phabricator.wikimedia.org/P55043 and previous config saved to /var/cache/conftool/dbconfig/20240119-160459-marostegui.json
  • 15:57 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2088.codfw.wmnet with OS bullseye
  • 15:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P55042 and previous config saved to /var/cache/conftool/dbconfig/20240119-154953-marostegui.json
  • 15:46 gmodena@deploy2002: Finished deploy [airflow-dags/analytics@f32c06e]: (no justification provided) (duration: 00m 30s)
  • 15:46 gmodena@deploy2002: Started deploy [airflow-dags/analytics@f32c06e]: (no justification provided)
  • 15:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P55041 and previous config saved to /var/cache/conftool/dbconfig/20240119-153446-marostegui.json
  • 15:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T354336)', diff saved to https://phabricator.wikimedia.org/P55040 and previous config saved to /var/cache/conftool/dbconfig/20240119-151940-marostegui.json
  • 15:14 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1166 (T354336)', diff saved to https://phabricator.wikimedia.org/P55039 and previous config saved to /var/cache/conftool/dbconfig/20240119-151413-marostegui.json
  • 15:14 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 15:13 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 15:09 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 15:09 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 15:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1140.eqiad.wmnet with reason: Maintenance
  • 15:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1140.eqiad.wmnet with reason: Maintenance
  • 15:01 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2118.codfw.wmnet with reason: Maintenance
  • 15:01 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2118.codfw.wmnet with reason: Maintenance
  • 14:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T354336)', diff saved to https://phabricator.wikimedia.org/P55038 and previous config saved to /var/cache/conftool/dbconfig/20240119-145930-marostegui.json
  • 14:56 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2094.codfw.wmnet with OS bullseye
  • 14:50 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic1107.eqiad.wmnet with OS bullseye
  • 14:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P55036 and previous config saved to /var/cache/conftool/dbconfig/20240119-144423-marostegui.json
  • 14:37 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic1103.eqiad.wmnet with OS bullseye
  • 14:35 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2088.codfw.wmnet with OS bullseye
  • 14:34 bking@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['elastic2088.codfw.wmnet']
  • 14:34 bking@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['elastic2088.codfw.wmnet']
  • 14:34 bking@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['elastic2088.codfw.wmnet']
  • 14:33 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic1107.eqiad.wmnet with reason: host reimage
  • 14:31 bking@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['elastic2088.codfw.wmnet']
  • 14:29 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1107.eqiad.wmnet with reason: host reimage
  • 14:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P55034 and previous config saved to /var/cache/conftool/dbconfig/20240119-142917-marostegui.json
  • 14:27 gmodena@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
  • 14:27 gmodena@deploy2002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
  • 14:24 ejegg: payments-wiki upgraded from c37ddae5 to c2138768
  • 14:21 gmodena@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
  • 14:21 gmodena@deploy2002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
  • 14:20 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic1103.eqiad.wmnet with reason: host reimage
  • 14:17 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1103.eqiad.wmnet with reason: host reimage
  • 14:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T354336)', diff saved to https://phabricator.wikimedia.org/P55033 and previous config saved to /var/cache/conftool/dbconfig/20240119-141411-marostegui.json
  • 14:13 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic1107.eqiad.wmnet with OS bullseye
  • 14:12 gmodena@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
  • 14:12 gmodena@deploy2002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
  • 14:08 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2182 (T354336)', diff saved to https://phabricator.wikimedia.org/P55032 and previous config saved to /var/cache/conftool/dbconfig/20240119-140746-marostegui.json
  • 14:08 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2182.codfw.wmnet with reason: Maintenance
  • 14:07 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2182.codfw.wmnet with reason: Maintenance
  • 14:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317 (T354336)', diff saved to https://phabricator.wikimedia.org/P55031 and previous config saved to /var/cache/conftool/dbconfig/20240119-140712-marostegui.json
  • 14:07 gmodena@deploy2002: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply
  • 14:06 gmodena@deploy2002: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
  • 14:02 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic1103.eqiad.wmnet with OS bullseye
  • 13:58 gmodena@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
  • 13:57 gmodena@deploy2002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
  • 13:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317', diff saved to https://phabricator.wikimedia.org/P55030 and previous config saved to /var/cache/conftool/dbconfig/20240119-135206-marostegui.json
  • 13:46 gmodena@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
  • 13:46 gmodena@deploy2002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
  • 13:43 gmodena@deploy2002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
  • 13:38 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2046.codfw.wmnet
  • 13:38 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1046.eqiad.wmnet
  • 13:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317', diff saved to https://phabricator.wikimedia.org/P55029 and previous config saved to /var/cache/conftool/dbconfig/20240119-133659-marostegui.json
  • 13:32 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2046.codfw.wmnet
  • 13:32 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1046.eqiad.wmnet
  • 13:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317 (T354336)', diff saved to https://phabricator.wikimedia.org/P55028 and previous config saved to /var/cache/conftool/dbconfig/20240119-132153-marostegui.json
  • 13:19 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2169:3317 (T354336)', diff saved to https://phabricator.wikimedia.org/P55027 and previous config saved to /var/cache/conftool/dbconfig/20240119-131929-marostegui.json
  • 13:19 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2169.codfw.wmnet with reason: Maintenance
  • 13:19 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2169.codfw.wmnet with reason: Maintenance
  • 13:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317 (T354336)', diff saved to https://phabricator.wikimedia.org/P55026 and previous config saved to /var/cache/conftool/dbconfig/20240119-131906-marostegui.json
  • 13:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317', diff saved to https://phabricator.wikimedia.org/P55024 and previous config saved to /var/cache/conftool/dbconfig/20240119-130400-marostegui.json
  • 12:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317', diff saved to https://phabricator.wikimedia.org/P55023 and previous config saved to /var/cache/conftool/dbconfig/20240119-124853-marostegui.json
  • 12:45 jelto@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 12:44 jelto@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 12:44 jelto@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 12:43 jelto@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 12:42 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 12:41 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 12:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317 (T354336)', diff saved to https://phabricator.wikimedia.org/P55022 and previous config saved to /var/cache/conftool/dbconfig/20240119-123347-marostegui.json
  • 12:32 gmodena@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
  • 12:32 gmodena@deploy2002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
  • 12:30 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2168:3317 (T354336)', diff saved to https://phabricator.wikimedia.org/P55021 and previous config saved to /var/cache/conftool/dbconfig/20240119-123023-marostegui.json
  • 12:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2168.codfw.wmnet with reason: Maintenance
  • 12:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2168.codfw.wmnet with reason: Maintenance
  • 12:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T354336)', diff saved to https://phabricator.wikimedia.org/P55020 and previous config saved to /var/cache/conftool/dbconfig/20240119-123001-marostegui.json
  • 12:30 gmodena@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
  • 12:29 gmodena@deploy2002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
  • 12:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P55019 and previous config saved to /var/cache/conftool/dbconfig/20240119-121455-marostegui.json
  • 11:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P55018 and previous config saved to /var/cache/conftool/dbconfig/20240119-115948-marostegui.json
  • 11:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1243 (T352010)', diff saved to https://phabricator.wikimedia.org/P55017 and previous config saved to /var/cache/conftool/dbconfig/20240119-114452-ladsgroup.json
  • 11:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T354336)', diff saved to https://phabricator.wikimedia.org/P55016 and previous config saved to /var/cache/conftool/dbconfig/20240119-114442-marostegui.json
  • 11:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1243.eqiad.wmnet with reason: Maintenance
  • 11:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1243.eqiad.wmnet with reason: Maintenance
  • 11:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1242 (T352010)', diff saved to https://phabricator.wikimedia.org/P55015 and previous config saved to /var/cache/conftool/dbconfig/20240119-114424-ladsgroup.json
  • 11:42 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2159 (T354336)', diff saved to https://phabricator.wikimedia.org/P55014 and previous config saved to /var/cache/conftool/dbconfig/20240119-114219-marostegui.json
  • 11:42 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 11:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 11:41 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2159.codfw.wmnet with reason: Maintenance
  • 11:41 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2159.codfw.wmnet with reason: Maintenance
  • 11:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T354336)', diff saved to https://phabricator.wikimedia.org/P55013 and previous config saved to /var/cache/conftool/dbconfig/20240119-114140-marostegui.json
  • 11:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P55012 and previous config saved to /var/cache/conftool/dbconfig/20240119-112917-ladsgroup.json
  • 11:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P55011 and previous config saved to /var/cache/conftool/dbconfig/20240119-112634-marostegui.json
  • 11:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P55010 and previous config saved to /var/cache/conftool/dbconfig/20240119-111411-ladsgroup.json
  • 11:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P55009 and previous config saved to /var/cache/conftool/dbconfig/20240119-111127-marostegui.json
  • 10:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1242 (T352010)', diff saved to https://phabricator.wikimedia.org/P55008 and previous config saved to /var/cache/conftool/dbconfig/20240119-105904-ladsgroup.json
  • 10:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T354336)', diff saved to https://phabricator.wikimedia.org/P55007 and previous config saved to /var/cache/conftool/dbconfig/20240119-105621-marostegui.json
  • 10:45 cmooney@cumin1002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin[1001-1002].eqiad.wmnet with reason: Release v0.6.5 - cmooney@cumin1002
  • 10:42 cmooney@cumin1002: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin[1001-1002].eqiad.wmnet with reason: Release v0.6.5 - cmooney@cumin1002
  • 10:13 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2150 (T354336)', diff saved to https://phabricator.wikimedia.org/P55006 and previous config saved to /var/cache/conftool/dbconfig/20240119-101340-marostegui.json
  • 10:13 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2150.codfw.wmnet with reason: Maintenance
  • 10:13 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2150.codfw.wmnet with reason: Maintenance
  • 10:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T354336)', diff saved to https://phabricator.wikimedia.org/P55005 and previous config saved to /var/cache/conftool/dbconfig/20240119-101318-marostegui.json
  • 09:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P55004 and previous config saved to /var/cache/conftool/dbconfig/20240119-095811-marostegui.json
  • 09:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P55003 and previous config saved to /var/cache/conftool/dbconfig/20240119-094305-marostegui.json
  • 09:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T354336)', diff saved to https://phabricator.wikimedia.org/P55002 and previous config saved to /var/cache/conftool/dbconfig/20240119-092758-marostegui.json
  • 09:25 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2122 (T354336)', diff saved to https://phabricator.wikimedia.org/P55001 and previous config saved to /var/cache/conftool/dbconfig/20240119-092535-marostegui.json
  • 09:25 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2122.codfw.wmnet with reason: Maintenance
  • 09:25 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2122.codfw.wmnet with reason: Maintenance
  • 09:25 jnuche@deploy2002: Installation of scap version "4.65.2" completed for 531 hosts
  • 09:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121 (T354336)', diff saved to https://phabricator.wikimedia.org/P55000 and previous config saved to /var/cache/conftool/dbconfig/20240119-092513-marostegui.json
  • 09:24 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host sessionstore2006.codfw.wmnet
  • 09:24 jnuche@deploy2002: Installing scap version "4.65.2" for 531 hosts
  • 09:15 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host sessionstore2006.codfw.wmnet
  • 09:14 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host sessionstore2005.codfw.wmnet
  • 09:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P54999 and previous config saved to /var/cache/conftool/dbconfig/20240119-091007-marostegui.json
  • 09:03 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host sessionstore2005.codfw.wmnet
  • 09:03 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host sessionstore2004.codfw.wmnet
  • 08:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P54998 and previous config saved to /var/cache/conftool/dbconfig/20240119-085500-marostegui.json
  • 08:53 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host sessionstore2004.codfw.wmnet
  • 08:50 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host sessionstore1006.eqiad.wmnet
  • 08:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121 (T354336)', diff saved to https://phabricator.wikimedia.org/P54997 and previous config saved to /var/cache/conftool/dbconfig/20240119-083954-marostegui.json
  • 08:39 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host sessionstore1006.eqiad.wmnet
  • 08:37 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2121 (T354336)', diff saved to https://phabricator.wikimedia.org/P54996 and previous config saved to /var/cache/conftool/dbconfig/20240119-083730-marostegui.json
  • 08:37 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 08:37 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 08:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2120 (T354336)', diff saved to https://phabricator.wikimedia.org/P54995 and previous config saved to /var/cache/conftool/dbconfig/20240119-083709-marostegui.json
  • 08:32 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host sessionstore1005.eqiad.wmnet
  • 08:22 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host sessionstore1005.eqiad.wmnet
  • 08:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2120', diff saved to https://phabricator.wikimedia.org/P54994 and previous config saved to /var/cache/conftool/dbconfig/20240119-082202-marostegui.json
  • 08:20 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host sessionstore1004.eqiad.wmnet
  • 08:11 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host sessionstore1004.eqiad.wmnet
  • 08:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2120', diff saved to https://phabricator.wikimedia.org/P54993 and previous config saved to /var/cache/conftool/dbconfig/20240119-080655-marostegui.json
  • 07:58 marostegui@cumin1002: dbctl commit (dc=all): 'db1227 (re)pooling @ 100%: T354336', diff saved to https://phabricator.wikimedia.org/P54992 and previous config saved to /var/cache/conftool/dbconfig/20240119-075828-root.json
  • 07:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2120 (T354336)', diff saved to https://phabricator.wikimedia.org/P54991 and previous config saved to /var/cache/conftool/dbconfig/20240119-075149-marostegui.json
  • 07:48 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2120 (T354336)', diff saved to https://phabricator.wikimedia.org/P54990 and previous config saved to /var/cache/conftool/dbconfig/20240119-074825-marostegui.json
  • 07:48 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2120.codfw.wmnet with reason: Maintenance
  • 07:47 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2120.codfw.wmnet with reason: Maintenance
  • 07:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2108 (T354336)', diff saved to https://phabricator.wikimedia.org/P54989 and previous config saved to /var/cache/conftool/dbconfig/20240119-074752-marostegui.json
  • 07:43 marostegui@cumin1002: dbctl commit (dc=all): 'db1227 (re)pooling @ 75%: T354336', diff saved to https://phabricator.wikimedia.org/P54988 and previous config saved to /var/cache/conftool/dbconfig/20240119-074323-root.json
  • 07:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2108', diff saved to https://phabricator.wikimedia.org/P54987 and previous config saved to /var/cache/conftool/dbconfig/20240119-073245-marostegui.json
  • 07:28 marostegui@cumin1002: dbctl commit (dc=all): 'db1227 (re)pooling @ 50%: T354336', diff saved to https://phabricator.wikimedia.org/P54986 and previous config saved to /var/cache/conftool/dbconfig/20240119-072818-root.json
  • 07:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2108', diff saved to https://phabricator.wikimedia.org/P54985 and previous config saved to /var/cache/conftool/dbconfig/20240119-071739-marostegui.json
  • 07:13 marostegui@cumin1002: dbctl commit (dc=all): 'db1227 (re)pooling @ 25%: T354336', diff saved to https://phabricator.wikimedia.org/P54984 and previous config saved to /var/cache/conftool/dbconfig/20240119-071313-root.json
  • 07:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2108 (T354336)', diff saved to https://phabricator.wikimedia.org/P54983 and previous config saved to /var/cache/conftool/dbconfig/20240119-070233-marostegui.json
  • 07:00 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2108 (T354336)', diff saved to https://phabricator.wikimedia.org/P54982 and previous config saved to /var/cache/conftool/dbconfig/20240119-070009-marostegui.json
  • 07:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2108.codfw.wmnet with reason: Maintenance
  • 06:59 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2108.codfw.wmnet with reason: Maintenance
  • 06:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2100.codfw.wmnet with reason: Maintenance
  • 06:59 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2100.codfw.wmnet with reason: Maintenance
  • 06:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 06:58 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 06:58 marostegui@cumin1002: dbctl commit (dc=all): 'db1227 (re)pooling @ 10%: T354336', diff saved to https://phabricator.wikimedia.org/P54981 and previous config saved to /var/cache/conftool/dbconfig/20240119-065808-root.json
  • 06:57 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 06:57 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 06:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1227.eqiad.wmnet with reason: Maintenance
  • 06:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1227.eqiad.wmnet with reason: Maintenance
  • 06:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1227.eqiad.wmnet with reason: Maintenance
  • 06:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1227.eqiad.wmnet with reason: Maintenance
  • 06:30 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1227 (T354336)', diff saved to https://phabricator.wikimedia.org/P54979 and previous config saved to /var/cache/conftool/dbconfig/20240119-063020-marostegui.json
  • 06:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1227.eqiad.wmnet with reason: Maintenance
  • 06:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1227.eqiad.wmnet with reason: Maintenance
  • 06:28 marostegui@cumin1002: END (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 8:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 06:28 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 06:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1242 (T352010)', diff saved to https://phabricator.wikimedia.org/P54978 and previous config saved to /var/cache/conftool/dbconfig/20240119-061827-ladsgroup.json
  • 06:18 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1242.eqiad.wmnet with reason: Maintenance
  • 06:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1242.eqiad.wmnet with reason: Maintenance
  • 06:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1241 (T352010)', diff saved to https://phabricator.wikimedia.org/P54977 and previous config saved to /var/cache/conftool/dbconfig/20240119-061805-ladsgroup.json
  • 06:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P54976 and previous config saved to /var/cache/conftool/dbconfig/20240119-060258-ladsgroup.json
  • 05:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P54975 and previous config saved to /var/cache/conftool/dbconfig/20240119-054751-ladsgroup.json
  • 05:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1241 (T352010)', diff saved to https://phabricator.wikimedia.org/P54974 and previous config saved to /var/cache/conftool/dbconfig/20240119-053244-ladsgroup.json
  • 03:38 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2094.codfw.wmnet with OS bullseye
  • 02:49 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic1103.eqiad.wmnet with OS bullseye
  • 02:48 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic1106.eqiad.wmnet with OS bullseye
  • 02:45 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic1105.eqiad.wmnet with OS bullseye
  • 02:41 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic1104.eqiad.wmnet with OS bullseye
  • 02:31 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic1106.eqiad.wmnet with reason: host reimage
  • 02:28 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1106.eqiad.wmnet with reason: host reimage
  • 02:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic1105.eqiad.wmnet with reason: host reimage
  • 02:24 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1105.eqiad.wmnet with reason: host reimage
  • 02:24 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic1104.eqiad.wmnet with reason: host reimage
  • 02:21 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1104.eqiad.wmnet with reason: host reimage
  • 02:18 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2094.codfw.wmnet with OS bullseye
  • 02:17 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2088.codfw.wmnet with OS bullseye
  • 02:12 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic1106.eqiad.wmnet with OS bullseye
  • 02:09 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic1105.eqiad.wmnet with OS bullseye
  • 02:09 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2094.codfw.wmnet with OS bullseye
  • 02:06 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic1104.eqiad.wmnet with OS bullseye
  • 02:01 tzatziki: removing 4 files for legal compliance
  • 01:42 tzatziki: removing 3 files for legal compliance
  • 01:28 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic1103.eqiad.wmnet with OS bullseye
  • 01:08 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2097.codfw.wmnet with OS bullseye
  • 01:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2096.codfw.wmnet with OS bullseye
  • 00:57 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2088.codfw.wmnet with OS bullseye
  • 00:50 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2097.codfw.wmnet with reason: host reimage
  • 00:50 tzatziki: removing 1 file for legal compliance
  • 00:49 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2094.codfw.wmnet with OS bullseye
  • 00:47 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2097.codfw.wmnet with reason: host reimage
  • 00:46 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2096.codfw.wmnet with reason: host reimage
  • 00:43 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2096.codfw.wmnet with reason: host reimage
  • 00:42 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2101.codfw.wmnet with OS bullseye
  • 00:40 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2100.codfw.wmnet with OS bullseye
  • 00:34 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2099.codfw.wmnet with OS bullseye
  • 00:30 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2097.codfw.wmnet with OS bullseye
  • 00:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1241 (T352010)', diff saved to https://phabricator.wikimedia.org/P54973 and previous config saved to /var/cache/conftool/dbconfig/20240119-002755-ladsgroup.json
  • 00:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1241.eqiad.wmnet with reason: Maintenance
  • 00:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1241.eqiad.wmnet with reason: Maintenance
  • 00:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1238 (T352010)', diff saved to https://phabricator.wikimedia.org/P54972 and previous config saved to /var/cache/conftool/dbconfig/20240119-002733-ladsgroup.json
  • 00:26 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2096.codfw.wmnet with OS bullseye
  • 00:26 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2098.codfw.wmnet with OS bullseye
  • 00:25 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2101.codfw.wmnet with reason: host reimage
  • 00:22 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2100.codfw.wmnet with reason: host reimage
  • 00:21 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2101.codfw.wmnet with reason: host reimage
  • 00:18 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2100.codfw.wmnet with reason: host reimage
  • 00:17 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2099.codfw.wmnet with reason: host reimage
  • 00:14 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2099.codfw.wmnet with reason: host reimage
  • 00:13 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on wdqs1020.eqiad.wmnet with reason: needs to catch up from its lag
  • 00:13 bking@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on wdqs1020.eqiad.wmnet with reason: needs to catch up from its lag
  • 00:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P54971 and previous config saved to /var/cache/conftool/dbconfig/20240119-001226-ladsgroup.json
  • 00:12 inflatador: bking@wdqs1020 depool host to catch up on lag
  • 00:08 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2098.codfw.wmnet with reason: host reimage
  • 00:05 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2098.codfw.wmnet with reason: host reimage
  • 00:05 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2101.codfw.wmnet with OS bullseye
  • 00:02 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2100.codfw.wmnet with OS bullseye

2024-01-18

  • 23:57 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2099.codfw.wmnet with OS bullseye
  • 23:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P54970 and previous config saved to /var/cache/conftool/dbconfig/20240118-235720-ladsgroup.json
  • 23:50 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2094.codfw.wmnet with OS bullseye
  • 23:49 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2098.codfw.wmnet with OS bullseye
  • 23:47 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2088.codfw.wmnet with OS bullseye
  • 23:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1238 (T352010)', diff saved to https://phabricator.wikimedia.org/P54969 and previous config saved to /var/cache/conftool/dbconfig/20240118-234213-ladsgroup.json
  • 23:13 tstarling@deploy2002: Synchronized php-1.42.0-wmf.14/extensions/CodeMirror/resources/mode/mediawiki/mediawiki.less: fix CodeMirror style bug T355290 (duration: 06m 33s)
  • 22:59 bking@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host elastic2086.codfw.wmnet
  • 22:55 bking@cumin2002: START - Cookbook sre.puppet.migrate-host for host elastic2086.codfw.wmnet
  • 22:55 bking@cumin2002: END (FAIL) - Cookbook sre.puppet.migrate-host (exit_code=99) for host elastic2086*
  • 22:54 bking@cumin2002: START - Cookbook sre.puppet.migrate-host for host elastic2086*
  • 22:30 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2094.codfw.wmnet with OS bullseye
  • 22:00 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2088.codfw.wmnet with OS bullseye
  • 21:59 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2088.codfw.wmnet with OS bullseye
  • 21:57 urbanecm@deploy2002: Finished scap: Backport for Use BetaFeatures::isFeatureEnabled instead of getOption (T354288) (duration: 06m 58s)
  • 21:50 urbanecm@deploy2002: Started scap: Backport for Use BetaFeatures::isFeatureEnabled instead of getOption (T354288)
  • 21:41 jforrester@deploy2002: Finished scap: Backport for Promote wikimaniawiki to Vector 2022 as default skin (T355297) (duration: 07m 33s)
  • 21:35 jforrester@deploy2002: jforrester and msz2001: Continuing with sync
  • 21:35 jforrester@deploy2002: jforrester and msz2001: Backport for Promote wikimaniawiki to Vector 2022 as default skin (T355297) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:34 jforrester@deploy2002: Started scap: Backport for Promote wikimaniawiki to Vector 2022 as default skin (T355297)
  • 21:15 Dreamy_Jazz: T351400 running in a tmux session `mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --use-jobqueue --sleep 30 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-30-no-render-now.txt`
  • 21:14 dreamyjazz@deploy2002: Finished scap: Backport for Log to statsd HTTP status codes and reduce logstash log levels (T355216) (duration: 09m 00s)
  • 21:14 Dreamy_Jazz: Stopped MediaModeration scanning script (T351400)
  • 21:13 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 21:13 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 21:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1236 (T354336)', diff saved to https://phabricator.wikimedia.org/P54968 and previous config saved to /var/cache/conftool/dbconfig/20240118-211337-marostegui.json
  • 21:08 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
  • 21:08 dreamyjazz@deploy2002: dreamyjazz: Backport for Log to statsd HTTP status codes and reduce logstash log levels (T355216) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:05 dreamyjazz@deploy2002: Started scap: Backport for Log to statsd HTTP status codes and reduce logstash log levels (T355216)
  • 21:04 ejegg: payments-wiki upgraded from e38b24f0 to c37ddae5
  • 20:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1236', diff saved to https://phabricator.wikimedia.org/P54967 and previous config saved to /var/cache/conftool/dbconfig/20240118-205830-marostegui.json
  • 20:44 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2094.codfw.wmnet with OS bullseye
  • 20:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1236', diff saved to https://phabricator.wikimedia.org/P54966 and previous config saved to /var/cache/conftool/dbconfig/20240118-204324-marostegui.json
  • 20:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1236 (T354336)', diff saved to https://phabricator.wikimedia.org/P54965 and previous config saved to /var/cache/conftool/dbconfig/20240118-202817-marostegui.json
  • 20:26 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1236 (T354336)', diff saved to https://phabricator.wikimedia.org/P54964 and previous config saved to /var/cache/conftool/dbconfig/20240118-202606-marostegui.json
  • 20:26 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1236.eqiad.wmnet with reason: Maintenance
  • 20:25 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1236.eqiad.wmnet with reason: Maintenance
  • 20:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T354336)', diff saved to https://phabricator.wikimedia.org/P54963 and previous config saved to /var/cache/conftool/dbconfig/20240118-202544-marostegui.json
  • 20:24 mutante: rsyncing phab repo data, gitlab2003 pulls from phab2002 (inactive server) - test only to see how long it will take, can be stopped
  • 20:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P54962 and previous config saved to /var/cache/conftool/dbconfig/20240118-201037-marostegui.json
  • 20:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2095.codfw.wmnet with OS bullseye
  • 19:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P54961 and previous config saved to /var/cache/conftool/dbconfig/20240118-195531-marostegui.json
  • 19:48 ryankemper: T354662 Running `sudo -i authdns-update` on `dns1004` following merge of https://gerrit.wikimedia.org/r/c/operations/dns/+/991429
  • 19:46 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2095.codfw.wmnet with reason: host reimage
  • 19:43 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2095.codfw.wmnet with reason: host reimage
  • 19:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T354336)', diff saved to https://phabricator.wikimedia.org/P54960 and previous config saved to /var/cache/conftool/dbconfig/20240118-194024-marostegui.json
  • 19:26 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2095.codfw.wmnet with OS bullseye
  • 19:24 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2093.codfw.wmnet with OS bullseye
  • 19:23 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2094.codfw.wmnet with OS bullseye
  • 19:19 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2092.codfw.wmnet with OS bullseye
  • 19:11 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2091.codfw.wmnet with OS bullseye
  • 19:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2093.codfw.wmnet with reason: host reimage
  • 19:06 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2089.codfw.wmnet with OS bullseye
  • 19:04 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2093.codfw.wmnet with reason: host reimage
  • 19:02 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2092.codfw.wmnet with reason: host reimage
  • 18:59 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2092.codfw.wmnet with reason: host reimage
  • 18:54 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2091.codfw.wmnet with reason: host reimage
  • 18:51 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2091.codfw.wmnet with reason: host reimage
  • 18:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1238 (T352010)', diff saved to https://phabricator.wikimedia.org/P54959 and previous config saved to /var/cache/conftool/dbconfig/20240118-185038-ladsgroup.json
  • 18:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1238.eqiad.wmnet with reason: Maintenance
  • 18:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1238.eqiad.wmnet with reason: Maintenance
  • 18:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1221 (T352010)', diff saved to https://phabricator.wikimedia.org/P54958 and previous config saved to /var/cache/conftool/dbconfig/20240118-185016-ladsgroup.json
  • 18:48 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2089.codfw.wmnet with reason: host reimage
  • 18:47 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2093.codfw.wmnet with OS bullseye
  • 18:45 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2089.codfw.wmnet with reason: host reimage
  • 18:42 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2092.codfw.wmnet with OS bullseye
  • 18:40 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1227 (T354336)', diff saved to https://phabricator.wikimedia.org/P54957 and previous config saved to /var/cache/conftool/dbconfig/20240118-184002-marostegui.json
  • 18:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1227.eqiad.wmnet with reason: Maintenance
  • 18:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1227.eqiad.wmnet with reason: Maintenance
  • 18:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T354336)', diff saved to https://phabricator.wikimedia.org/P54956 and previous config saved to /var/cache/conftool/dbconfig/20240118-183940-marostegui.json
  • 18:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P54955 and previous config saved to /var/cache/conftool/dbconfig/20240118-183510-ladsgroup.json
  • 18:34 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2091.codfw.wmnet with OS bullseye
  • 18:28 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2089.codfw.wmnet with OS bullseye
  • 18:25 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2088.codfw.wmnet with OS bullseye
  • 18:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P54954 and previous config saved to /var/cache/conftool/dbconfig/20240118-182433-marostegui.json
  • 18:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P54953 and previous config saved to /var/cache/conftool/dbconfig/20240118-182003-ladsgroup.json
  • 18:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P54951 and previous config saved to /var/cache/conftool/dbconfig/20240118-180927-marostegui.json
  • 18:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1221 (T352010)', diff saved to https://phabricator.wikimedia.org/P54950 and previous config saved to /var/cache/conftool/dbconfig/20240118-180456-ladsgroup.json
  • 17:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T354336)', diff saved to https://phabricator.wikimedia.org/P54949 and previous config saved to /var/cache/conftool/dbconfig/20240118-175420-marostegui.json
  • 17:52 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1202 (T354336)', diff saved to https://phabricator.wikimedia.org/P54948 and previous config saved to /var/cache/conftool/dbconfig/20240118-175209-marostegui.json
  • 17:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1202.eqiad.wmnet with reason: Maintenance
  • 17:51 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1202.eqiad.wmnet with reason: Maintenance
  • 17:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T354336)', diff saved to https://phabricator.wikimedia.org/P54947 and previous config saved to /var/cache/conftool/dbconfig/20240118-175147-marostegui.json
  • 17:43 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2097.codfw.wmnet with OS bullseye
  • 17:42 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2101.codfw.wmnet with OS bullseye
  • 17:39 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2096.codfw.wmnet with OS bullseye
  • 17:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P54946 and previous config saved to /var/cache/conftool/dbconfig/20240118-173640-marostegui.json
  • 17:36 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2095.codfw.wmnet with OS bullseye
  • 17:36 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2100.codfw.wmnet with OS bullseye
  • 17:33 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2094.codfw.wmnet with OS bullseye
  • 17:31 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2102.codfw.wmnet with OS bullseye
  • 17:30 topranks: Re-enabling PyBal on lvs2011 after network migration T352912
  • 17:30 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2093.codfw.wmnet with OS bullseye
  • 17:28 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2099.codfw.wmnet with OS bullseye
  • 17:27 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2092.codfw.wmnet with OS bullseye
  • 17:25 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2091.codfw.wmnet with OS bullseye
  • 17:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P54945 and previous config saved to /var/cache/conftool/dbconfig/20240118-172134-marostegui.json
  • 17:20 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2098.codfw.wmnet with OS bullseye
  • 17:14 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2102.codfw.wmnet with reason: host reimage
  • 17:11 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2102.codfw.wmnet with reason: host reimage
  • 17:11 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2089.codfw.wmnet with OS bullseye
  • 17:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T354336)', diff saved to https://phabricator.wikimedia.org/P54944 and previous config saved to /var/cache/conftool/dbconfig/20240118-170627-marostegui.json
  • 17:06 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2088.codfw.wmnet with OS bullseye
  • 17:04 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1194 (T354336)', diff saved to https://phabricator.wikimedia.org/P54943 and previous config saved to /var/cache/conftool/dbconfig/20240118-170417-marostegui.json
  • 17:04 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1194.eqiad.wmnet with reason: Maintenance
  • 17:04 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1194.eqiad.wmnet with reason: Maintenance
  • 17:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T354336)', diff saved to https://phabricator.wikimedia.org/P54942 and previous config saved to /var/cache/conftool/dbconfig/20240118-170355-marostegui.json
  • 16:54 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2102.codfw.wmnet with OS bullseye
  • 16:49 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2101.codfw.wmnet with OS bullseye
  • 16:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P54941 and previous config saved to /var/cache/conftool/dbconfig/20240118-164848-marostegui.json
  • 16:42 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2100.codfw.wmnet with OS bullseye
  • 16:36 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2090.codfw.wmnet with OS bullseye
  • 16:35 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2099.codfw.wmnet with OS bullseye
  • 16:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P54940 and previous config saved to /var/cache/conftool/dbconfig/20240118-163342-marostegui.json
  • 16:33 hashar@deploy2002: Finished deploy [integration/docroot@1d9323f]: Remove Wikimedia Design Style Guide from the list - T347895 (duration: 00m 07s)
  • 16:33 hashar@deploy2002: Started deploy [integration/docroot@1d9323f]: Remove Wikimedia Design Style Guide from the list - T347895
  • 16:27 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2098.codfw.wmnet with OS bullseye
  • 16:25 sukhe: running authdns-update for T355308
  • 16:22 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2097.codfw.wmnet with OS bullseye
  • 16:18 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2090.codfw.wmnet with reason: host reimage
  • 16:18 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T354336)', diff saved to https://phabricator.wikimedia.org/P54939 and previous config saved to /var/cache/conftool/dbconfig/20240118-161834-marostegui.json
  • 16:18 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2096.codfw.wmnet with OS bullseye
  • 16:18 claime: Running puppet on 'P{P:kubernetes::node} and not P{F:lldp.parent ~ lsw}' - T352883
  • 16:16 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1191 (T354336)', diff saved to https://phabricator.wikimedia.org/P54938 and previous config saved to /var/cache/conftool/dbconfig/20240118-161624-marostegui.json
  • 16:16 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1191.eqiad.wmnet with reason: Maintenance
  • 16:16 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1191.eqiad.wmnet with reason: Maintenance
  • 16:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T354336)', diff saved to https://phabricator.wikimedia.org/P54937 and previous config saved to /var/cache/conftool/dbconfig/20240118-161602-marostegui.json
  • 16:15 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2090.codfw.wmnet with reason: host reimage
  • 16:15 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2095.codfw.wmnet with OS bullseye
  • 16:12 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2094.codfw.wmnet with OS bullseye
  • 16:09 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2093.codfw.wmnet with OS bullseye
  • 16:06 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2092.codfw.wmnet with OS bullseye
  • 16:06 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 6 hosts with reason: moving lvs2011 network link T352912
  • 16:06 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on 6 hosts with reason: moving lvs2011 network link T352912
  • 16:06 cmooney@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on cr2-codfw,cr[1-2]-codfw IPv6,re0.cr1-codfw.mgmt,re0.cr2-codfw.mgmt cr1-codfw with reason: moving lvs2011 network link T352912
  • 16:05 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cr2-codfw,cr[1-2]-codfw IPv6,re0.cr1-codfw.mgmt,re0.cr2-codfw.mgmt cr1-codfw with reason: moving lvs2011 network link T352912
  • 16:04 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2011.codfw.wmnet with reason: moving lvs2011 network link T352912
  • 16:04 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs2011.codfw.wmnet with reason: moving lvs2011 network link T352912
  • 16:04 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2091.codfw.wmnet with OS bullseye
  • 16:03 claime: Running puppet on 'P{P:kubernetes::node} and P{F:lldp.parent ~ lsw}' - T352883
  • 16:02 topranks: disabling PyBal and puppet on lvs2011, moving traffic to lvs2014 ahead of network change T352912
  • 16:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P54936 and previous config saved to /var/cache/conftool/dbconfig/20240118-160055-marostegui.json
  • 15:59 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1461.eqiad.wmnet with OS bullseye
  • 15:57 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2090.codfw.wmnet with OS bullseye
  • 15:56 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1439.eqiad.wmnet with OS bullseye
  • 15:54 claime: Running puppet on A:wikikube-staging-worker - T352883
  • 15:53 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1469.eqiad.wmnet with OS bullseye
  • 15:52 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1045.eqiad.wmnet
  • 15:52 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2045.codfw.wmnet
  • 15:52 claime: Running puppet on kubestage2002 - T352883
  • 15:52 claime: stopping puppet on P:kubernetes::node to deploy 980927 - T352883
  • 15:50 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2089.codfw.wmnet with OS bullseye
  • 15:49 claime: Running puppet on kubestage2002 - T352893
  • 15:46 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1045.eqiad.wmnet
  • 15:46 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2045.codfw.wmnet
  • 15:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P54935 and previous config saved to /var/cache/conftool/dbconfig/20240118-154549-marostegui.json
  • 15:45 claime: stopping puppet on P:kubernetes::node to deploy 980927 - T352893
  • 15:45 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2088.codfw.wmnet with OS bullseye
  • 15:40 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1461.eqiad.wmnet with reason: host reimage
  • 15:37 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1439.eqiad.wmnet with reason: host reimage
  • 15:35 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1469.eqiad.wmnet with reason: host reimage
  • 15:32 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1461.eqiad.wmnet with reason: host reimage
  • 15:32 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1439.eqiad.wmnet with reason: host reimage
  • 15:31 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1469.eqiad.wmnet with reason: host reimage
  • 15:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T354336)', diff saved to https://phabricator.wikimedia.org/P54933 and previous config saved to /var/cache/conftool/dbconfig/20240118-153042-marostegui.json
  • 15:28 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1174 (T354336)', diff saved to https://phabricator.wikimedia.org/P54932 and previous config saved to /var/cache/conftool/dbconfig/20240118-152832-marostegui.json
  • 15:28 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 15:28 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 15:28 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 15:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 15:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T354336)', diff saved to https://phabricator.wikimedia.org/P54931 and previous config saved to /var/cache/conftool/dbconfig/20240118-152747-marostegui.json
  • 15:20 marostegui@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 100%: T355313', diff saved to https://phabricator.wikimedia.org/P54930 and previous config saved to /var/cache/conftool/dbconfig/20240118-152006-root.json
  • 15:18 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1439.eqiad.wmnet with OS bullseye
  • 15:18 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1469.eqiad.wmnet with OS bullseye
  • 15:18 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1461.eqiad.wmnet with OS bullseye
  • 15:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P54929 and previous config saved to /var/cache/conftool/dbconfig/20240118-151241-marostegui.json
  • 15:05 marostegui@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 75%: T355313', diff saved to https://phabricator.wikimedia.org/P54928 and previous config saved to /var/cache/conftool/dbconfig/20240118-150501-root.json
  • 14:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P54927 and previous config saved to /var/cache/conftool/dbconfig/20240118-145734-marostegui.json
  • 14:49 marostegui@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 50%: T355313', diff saved to https://phabricator.wikimedia.org/P54926 and previous config saved to /var/cache/conftool/dbconfig/20240118-144956-root.json
  • 14:43 Dreamy_Jazz: Afternoon UTC backport window done
  • 14:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T354336)', diff saved to https://phabricator.wikimedia.org/P54925 and previous config saved to /var/cache/conftool/dbconfig/20240118-144228-marostegui.json
  • 14:42 Emperor: disable puppet on ms-be2072 to try and deal with faulty drive
  • 14:42 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1170:3317 (T354336)', diff saved to https://phabricator.wikimedia.org/P54924 and previous config saved to /var/cache/conftool/dbconfig/20240118-144214-marostegui.json
  • 14:42 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 14:41 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 14:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T354336)', diff saved to https://phabricator.wikimedia.org/P54923 and previous config saved to /var/cache/conftool/dbconfig/20240118-144152-marostegui.json
  • 14:41 Dreamy_Jazz: Ran `echo 'https://en.wikipedia.org/static/images/mobile/copyright/wikipedia-tagline-th.svg' | mwscript purgeList.php`, `echo 'https://en.wikipedia.org/static/images/mobile/copyright/wikipedia-wordmark-th.svg' | mwscript purgeList.php`, `echo 'https://en.wikipedia.org/static/images/project-logos/thwiki.png' | mwscript purgeList.php`, `echo 'https://en.wikipedia.org/static/images/project-logos/thwiki-1.5x.png' | mwscript purgeList.php`, and `echo 'https://en.wikipedia.org/static/images/project-logos/thwiki-2x.png' | mwscript purgeList.php`
  • 14:38 dreamyjazz@deploy2002: Finished scap: Backport for thwiki: update tagline and optimise other logos (T341407) (duration: 08m 28s)
  • 14:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10 days, 0:00:00 on debmonitor2003.codfw.wmnet with reason: WIP
  • 14:35 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 10 days, 0:00:00 on debmonitor2003.codfw.wmnet with reason: WIP
  • 14:34 marostegui@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 25%: T355313', diff saved to https://phabricator.wikimedia.org/P54922 and previous config saved to /var/cache/conftool/dbconfig/20240118-143451-root.json
  • 14:33 dreamyjazz@deploy2002: anzx and dreamyjazz: Continuing with sync
  • 14:31 dreamyjazz@deploy2002: anzx and dreamyjazz: Backport for thwiki: update tagline and optimise other logos (T341407) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:30 dreamyjazz@deploy2002: Started scap: Backport for thwiki: update tagline and optimise other logos (T341407)
  • 14:28 kartik@deploy2002: Finished scap: Backport for Set MT threshold for Punjabi Wikipedia to 97 (T347789) (duration: 10m 03s)
  • 14:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P54921 and previous config saved to /var/cache/conftool/dbconfig/20240118-142646-marostegui.json
  • 14:24 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: aqs
  • 14:22 kartik@deploy2002: kartik: Continuing with sync
  • 14:19 kartik@deploy2002: kartik: Backport for Set MT threshold for Punjabi Wikipedia to 97 (T347789) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:19 marostegui@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 10%: T355313', diff saved to https://phabricator.wikimedia.org/P54920 and previous config saved to /var/cache/conftool/dbconfig/20240118-141946-root.json
  • 14:18 kartik@deploy2002: Started scap: Backport for Set MT threshold for Punjabi Wikipedia to 97 (T347789)
  • 14:12 Dreamy_Jazz: running on a tmux session `mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --use-jobqueue --sleep 30 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-30-no-render-now.txt`
  • 14:11 dreamyjazz@deploy2002: Finished scap: Backport for Remove RENDER_NOW from File::transform call to avoid job thumbnailing (T355309) (duration: 07m 50s)
  • 14:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P54919 and previous config saved to /var/cache/conftool/dbconfig/20240118-141139-marostegui.json
  • 14:07 Dreamy_Jazz: Stopped MediaModeration scan for commonswiki
  • 14:07 Dreamy_Jazz: stopped MediaModeration scan for group2
  • 14:06 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: aqs
  • 14:06 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
  • 14:05 dreamyjazz@deploy2002: dreamyjazz: Backport for Remove RENDER_NOW from File::transform call to avoid job thumbnailing (T355309) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:04 marostegui@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 5%: T355313', diff saved to https://phabricator.wikimedia.org/P54918 and previous config saved to /var/cache/conftool/dbconfig/20240118-140441-root.json
  • 14:03 dreamyjazz@deploy2002: Started scap: Backport for Remove RENDER_NOW from File::transform call to avoid job thumbnailing (T355309)
  • 13:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T354336)', diff saved to https://phabricator.wikimedia.org/P54917 and previous config saved to /var/cache/conftool/dbconfig/20240118-135633-marostegui.json
  • 13:54 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1158 (T354336)', diff saved to https://phabricator.wikimedia.org/P54916 and previous config saved to /var/cache/conftool/dbconfig/20240118-135422-marostegui.json
  • 13:54 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 13:54 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 13:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 13:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 13:51 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2107.codfw.wmnet with reason: Maintenance
  • 13:51 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2107.codfw.wmnet with reason: Maintenance
  • 13:49 marostegui@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 1%: T355313', diff saved to https://phabricator.wikimedia.org/P54915 and previous config saved to /var/cache/conftool/dbconfig/20240118-134936-root.json
  • 13:28 moritzm: installing python-requests security updates
  • 13:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189 (T354336)', diff saved to https://phabricator.wikimedia.org/P54914 and previous config saved to /var/cache/conftool/dbconfig/20240118-130451-marostegui.json
  • 12:54 stran@deploy2002: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
  • 12:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1221 (T352010)', diff saved to https://phabricator.wikimedia.org/P54913 and previous config saved to /var/cache/conftool/dbconfig/20240118-125130-ladsgroup.json
  • 12:51 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 12:51 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 12:51 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1221.eqiad.wmnet with reason: Maintenance
  • 12:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1221.eqiad.wmnet with reason: Maintenance
  • 12:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T352010)', diff saved to https://phabricator.wikimedia.org/P54912 and previous config saved to /var/cache/conftool/dbconfig/20240118-125048-ladsgroup.json
  • 12:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P54911 and previous config saved to /var/cache/conftool/dbconfig/20240118-124945-marostegui.json
  • 12:41 godog: grafana restarted on grafana1002 after https://gerrit.wikimedia.org/r/c/operations/puppet/+/991573
  • 12:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P54910 and previous config saved to /var/cache/conftool/dbconfig/20240118-123541-ladsgroup.json
  • 12:35 stran@deploy2002: helmfile [codfw] START helmfile.d/services/ipoid: apply
  • 12:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P54909 and previous config saved to /var/cache/conftool/dbconfig/20240118-123439-marostegui.json
  • 12:34 stran@deploy2002: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
  • 12:33 stran@deploy2002: helmfile [eqiad] START helmfile.d/services/ipoid: apply
  • 12:31 stran@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
  • 12:28 stran@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
  • 12:27 Dreamy_Jazz: Finished security deploy for T347742
  • 12:27 dreamyjazz@deploy2002: Finished scap: Backport for SECURITY: Use message label instead of sanitized text output for massmessage-form-page-help message (T347742) (duration: 08m 28s)
  • 12:27 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1047.eqiad.wmnet
  • 12:26 stran@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
  • 12:24 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2047.codfw.wmnet
  • 12:21 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
  • 12:20 dreamyjazz@deploy2002: dreamyjazz: Backport for SECURITY: Use message label instead of sanitized text output for massmessage-form-page-help message (T347742) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 12:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P54908 and previous config saved to /var/cache/conftool/dbconfig/20240118-122035-ladsgroup.json
  • 12:20 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2047.codfw.wmnet
  • 12:20 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1047.eqiad.wmnet
  • 12:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189 (T354336)', diff saved to https://phabricator.wikimedia.org/P54907 and previous config saved to /var/cache/conftool/dbconfig/20240118-121932-marostegui.json
  • 12:18 dreamyjazz@deploy2002: Started scap: Backport for SECURITY: Use message label instead of sanitized text output for massmessage-form-page-help message (T347742)
  • 12:17 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 12:17 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
  • 12:16 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 12:16 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
  • 12:16 jynus: depooled db2146 due to a lot of lag; should be investigated later
  • 12:15 jynus@cumin1002: dbctl commit (dc=all): 'Depool db2146', diff saved to https://phabricator.wikimedia.org/P54906 and previous config saved to /var/cache/conftool/dbconfig/20240118-121541-jynus.json
  • 12:07 Dreamy_Jazz: Doing security deploy for T347742
  • 12:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T352010)', diff saved to https://phabricator.wikimedia.org/P54905 and previous config saved to /var/cache/conftool/dbconfig/20240118-120528-ladsgroup.json
  • 11:45 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2189 (T354336)', diff saved to https://phabricator.wikimedia.org/P54904 and previous config saved to /var/cache/conftool/dbconfig/20240118-114551-marostegui.json
  • 11:45 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2189.codfw.wmnet with reason: Maintenance
  • 11:45 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2189.codfw.wmnet with reason: Maintenance
  • 11:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T354336)', diff saved to https://phabricator.wikimedia.org/P54903 and previous config saved to /var/cache/conftool/dbconfig/20240118-114528-marostegui.json
  • 11:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P54902 and previous config saved to /var/cache/conftool/dbconfig/20240118-113022-marostegui.json
  • 11:21 godog: bounce apache2 on logstash1025 / logstash1031 - T337818
  • 11:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P54901 and previous config saved to /var/cache/conftool/dbconfig/20240118-111516-marostegui.json
  • 11:04 cmooney@cumin1002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin[1001-1002].eqiad.wmnet with reason: Release v0.6.5 - cmooney@cumin1002
  • 11:01 cmooney@cumin1002: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin[1001-1002].eqiad.wmnet with reason: Release v0.6.5 - cmooney@cumin1002
  • 11:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T354336)', diff saved to https://phabricator.wikimedia.org/P54900 and previous config saved to /var/cache/conftool/dbconfig/20240118-110009-marostegui.json
  • 10:43 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2175 (T354336)', diff saved to https://phabricator.wikimedia.org/P54899 and previous config saved to /var/cache/conftool/dbconfig/20240118-104335-marostegui.json
  • 10:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2175.codfw.wmnet with reason: Maintenance
  • 10:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2175.codfw.wmnet with reason: Maintenance
  • 10:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312 (T354336)', diff saved to https://phabricator.wikimedia.org/P54898 and previous config saved to /var/cache/conftool/dbconfig/20240118-104313-marostegui.json
  • 10:37 hashar@deploy2002: Finished deploy [integration/docroot@8f5aa9e]: Add Codex Icons package (duration: 00m 05s)
  • 10:36 hashar@deploy2002: Started deploy [integration/docroot@8f5aa9e]: Add Codex Icons package
  • 10:32 hashar@deploy2002: Finished deploy [integration/docroot@88f6458]: Add npm package link for Codex Design Tokens - T354310 (duration: 00m 07s)
  • 10:32 hashar@deploy2002: Started deploy [integration/docroot@88f6458]: Add npm package link for Codex Design Tokens - T354310
  • 10:29 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2072.codfw.wmnet
  • 10:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312', diff saved to https://phabricator.wikimedia.org/P54896 and previous config saved to /var/cache/conftool/dbconfig/20240118-102806-marostegui.json
  • 10:26 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2047.codfw.wmnet
  • 10:25 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2072.codfw.wmnet
  • 10:22 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2047.codfw.wmnet
  • 10:19 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1047.eqiad.wmnet
  • 10:13 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1047.eqiad.wmnet
  • 10:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312', diff saved to https://phabricator.wikimedia.org/P54894 and previous config saved to /var/cache/conftool/dbconfig/20240118-101300-marostegui.json
  • 10:10 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2046.codfw.wmnet
  • 10:09 Dreamy_Jazz: T351400 running on a tmux session `foreachwikiindblist group2.dblist extensions/MediaModeration/maintenance/scanFilesInScanTable.php --sleep 0 --verbose 2>&1 | tee ~/scan-files-in-scan-table-group2-sleep-0-non-jobqueue.txt`
  • 10:04 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2046.codfw.wmnet
  • 10:01 btullis: built and published updated openjdk-11 images based on: 11.0.21-s0-20240111
  • 09:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312 (T354336)', diff saved to https://phabricator.wikimedia.org/P54893 and previous config saved to /var/cache/conftool/dbconfig/20240118-095753-marostegui.json
  • 09:55 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2170:3312 (T354336)', diff saved to https://phabricator.wikimedia.org/P54892 and previous config saved to /var/cache/conftool/dbconfig/20240118-095522-marostegui.json
  • 09:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2170.codfw.wmnet with reason: Maintenance
  • 09:55 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2170.codfw.wmnet with reason: Maintenance
  • 09:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T354336)', diff saved to https://phabricator.wikimedia.org/P54891 and previous config saved to /var/cache/conftool/dbconfig/20240118-095500-marostegui.json
  • 09:42 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1046.eqiad.wmnet
  • 09:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P54890 and previous config saved to /var/cache/conftool/dbconfig/20240118-093954-marostegui.json
  • 09:30 jnuche@deploy2002: rebuilt and synchronized wikiversions files: group2 wikis to 1.42.0-wmf.14 refs T354432
  • 09:26 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1046.eqiad.wmnet
  • 09:25 godog: add 50G to prometheus@k8s-mlserve in codfw
  • 09:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P54889 and previous config saved to /var/cache/conftool/dbconfig/20240118-092447-marostegui.json
  • 09:15 Dreamy_Jazz: T351400 running on a tmux session `mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --sleep 0 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-0-non-jobqueue.txt`
  • 09:12 Dreamy_Jazz: stopped MediaModeration scanning script
  • 09:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T354336)', diff saved to https://phabricator.wikimedia.org/P54888 and previous config saved to /var/cache/conftool/dbconfig/20240118-090941-marostegui.json
  • 09:07 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2148 (T354336)', diff saved to https://phabricator.wikimedia.org/P54887 and previous config saved to /var/cache/conftool/dbconfig/20240118-090712-marostegui.json
  • 09:07 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2148.codfw.wmnet with reason: Maintenance
  • 09:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2148.codfw.wmnet with reason: Maintenance
  • 09:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312 (T354336)', diff saved to https://phabricator.wikimedia.org/P54886 and previous config saved to /var/cache/conftool/dbconfig/20240118-090649-marostegui.json
  • 08:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312', diff saved to https://phabricator.wikimedia.org/P54885 and previous config saved to /var/cache/conftool/dbconfig/20240118-085143-marostegui.json
  • 08:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312', diff saved to https://phabricator.wikimedia.org/P54884 and previous config saved to /var/cache/conftool/dbconfig/20240118-083636-marostegui.json
  • 08:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312 (T354336)', diff saved to https://phabricator.wikimedia.org/P54883 and previous config saved to /var/cache/conftool/dbconfig/20240118-082130-marostegui.json
  • 08:19 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2138:3312 (T354336)', diff saved to https://phabricator.wikimedia.org/P54882 and previous config saved to /var/cache/conftool/dbconfig/20240118-081900-marostegui.json
  • 08:18 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2138.codfw.wmnet with reason: Maintenance
  • 08:18 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2138.codfw.wmnet with reason: Maintenance
  • 08:18 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126 (T354336)', diff saved to https://phabricator.wikimedia.org/P54881 and previous config saved to /var/cache/conftool/dbconfig/20240118-081838-marostegui.json
  • 08:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P54880 and previous config saved to /var/cache/conftool/dbconfig/20240118-080332-marostegui.json
  • 07:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P54879 and previous config saved to /var/cache/conftool/dbconfig/20240118-074825-marostegui.json
  • 07:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126 (T354336)', diff saved to https://phabricator.wikimedia.org/P54878 and previous config saved to /var/cache/conftool/dbconfig/20240118-073319-marostegui.json
  • 07:30 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2126 (T354336)', diff saved to https://phabricator.wikimedia.org/P54877 and previous config saved to /var/cache/conftool/dbconfig/20240118-073054-marostegui.json
  • 07:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 07:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 07:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2126.codfw.wmnet with reason: Maintenance
  • 07:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2126.codfw.wmnet with reason: Maintenance
  • 07:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125 (T354336)', diff saved to https://phabricator.wikimedia.org/P54876 and previous config saved to /var/cache/conftool/dbconfig/20240118-073016-marostegui.json
  • 07:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125', diff saved to https://phabricator.wikimedia.org/P54875 and previous config saved to /var/cache/conftool/dbconfig/20240118-071509-marostegui.json
  • 07:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125', diff saved to https://phabricator.wikimedia.org/P54874 and previous config saved to /var/cache/conftool/dbconfig/20240118-070003-marostegui.json
  • 06:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125 (T354336)', diff saved to https://phabricator.wikimedia.org/P54873 and previous config saved to /var/cache/conftool/dbconfig/20240118-064456-marostegui.json
  • 06:42 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2125 (T354336)', diff saved to https://phabricator.wikimedia.org/P54872 and previous config saved to /var/cache/conftool/dbconfig/20240118-064225-marostegui.json
  • 06:42 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2125.codfw.wmnet with reason: Maintenance
  • 06:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2125.codfw.wmnet with reason: Maintenance
  • 06:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2104 (T354336)', diff saved to https://phabricator.wikimedia.org/P54871 and previous config saved to /var/cache/conftool/dbconfig/20240118-064203-marostegui.json
  • 06:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2104', diff saved to https://phabricator.wikimedia.org/P54870 and previous config saved to /var/cache/conftool/dbconfig/20240118-062657-marostegui.json
  • 06:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2104', diff saved to https://phabricator.wikimedia.org/P54869 and previous config saved to /var/cache/conftool/dbconfig/20240118-061150-marostegui.json
  • 06:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1199 (T352010)', diff saved to https://phabricator.wikimedia.org/P54868 and previous config saved to /var/cache/conftool/dbconfig/20240118-061138-ladsgroup.json
  • 06:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1199.eqiad.wmnet with reason: Maintenance
  • 06:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1199.eqiad.wmnet with reason: Maintenance
  • 06:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T352010)', diff saved to https://phabricator.wikimedia.org/P54867 and previous config saved to /var/cache/conftool/dbconfig/20240118-061116-ladsgroup.json
  • 05:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2104 (T354336)', diff saved to https://phabricator.wikimedia.org/P54866 and previous config saved to /var/cache/conftool/dbconfig/20240118-055643-marostegui.json
  • 05:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P54865 and previous config saved to /var/cache/conftool/dbconfig/20240118-055609-ladsgroup.json
  • 05:54 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2104 (T354336)', diff saved to https://phabricator.wikimedia.org/P54864 and previous config saved to /var/cache/conftool/dbconfig/20240118-055419-marostegui.json
  • 05:54 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2104.codfw.wmnet with reason: Maintenance
  • 05:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2104.codfw.wmnet with reason: Maintenance
  • 05:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2097.codfw.wmnet with reason: Maintenance
  • 05:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2097.codfw.wmnet with reason: Maintenance
  • 05:48 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1162.eqiad.wmnet with reason: Maintenance
  • 05:48 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1162.eqiad.wmnet with reason: Maintenance
  • 05:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P54863 and previous config saved to /var/cache/conftool/dbconfig/20240118-054103-ladsgroup.json
  • 05:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T352010)', diff saved to https://phabricator.wikimedia.org/P54862 and previous config saved to /var/cache/conftool/dbconfig/20240118-052556-ladsgroup.json

2024-01-17

  • 23:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1190 (T352010)', diff saved to https://phabricator.wikimedia.org/P54861 and previous config saved to /var/cache/conftool/dbconfig/20240117-233655-ladsgroup.json
  • 23:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1190.eqiad.wmnet with reason: Maintenance
  • 23:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1190.eqiad.wmnet with reason: Maintenance
  • 22:01 inflatador: bking@kafka-main2001 `kafka topics --alter --topic eqiad.cirrussearch.update_pipeline.fetch_error.rc0 --partitions 5` T354595
  • 21:55 catrope@deploy2002: Finished scap: Backport for Fix text overflow in history page (T354218) (duration: 09m 39s)
  • 21:50 inflatador: bking@kafka-main2001 `kafka topics --alter --topic codfw.cirrussearch.update_pipeline.fetch_error.rc0 --partitions 5` T354595
  • 21:49 catrope@deploy2002: jdlrobson and catrope: Continuing with sync
  • 21:47 catrope@deploy2002: jdlrobson and catrope: Backport for Fix text overflow in history page (T354218) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:47 inflatador: bking@kafka-main2001 `kafka topics --alter --topic eqiad.cirrussearch.update_pipeline.update.rc0 --partitions 5` T354595
  • 21:45 catrope@deploy2002: Started scap: Backport for Fix text overflow in history page (T354218)
  • 21:43 catrope@deploy2002: Finished scap: Backport for Enable desktop history page for all mobile logged in users (T353388) (duration: 15m 15s)
  • 21:37 catrope@deploy2002: jdlrobson and catrope: Continuing with sync
  • 21:30 catrope@deploy2002: jdlrobson and catrope: Backport for Enable desktop history page for all mobile logged in users (T353388) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:28 catrope@deploy2002: Started scap: Backport for Enable desktop history page for all mobile logged in users (T353388)
  • 21:16 inflatador: bking@kafka-main1001 `kafka topics --alter --topic codfw.cirrussearch.update_pipeline.fetch_error.rc0 --partitions 5`
  • 21:15 inflatador: bking@kafka-main1001 `kafka topics --alter --topic eqiad.cirrussearch.update_pipeline.update.rc0 --partitions 5` T354595
  • 21:13 kharlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
  • 21:13 kharlan@deploy2002: helmfile [eqiad] START helmfile.d/services/ipoid: apply
  • 21:13 inflatador: bking@kafka-main1001 `kafka topics --alter --topic codfw.cirrussearch.update_pipeline.update.rc0 --partitions 5`
  • 21:07 kharlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
  • 21:07 kharlan@deploy2002: helmfile [codfw] START helmfile.d/services/ipoid: apply
  • 21:06 kharlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
  • 21:06 kharlan@deploy2002: helmfile [eqiad] START helmfile.d/services/ipoid: apply
  • 21:05 kharlan@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
  • 21:04 kharlan@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
  • 20:15 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 20:15 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 20:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T354336)', diff saved to https://phabricator.wikimedia.org/P54860 and previous config saved to /var/cache/conftool/dbconfig/20240117-201513-marostegui.json
  • 20:05 mutante: LDAP - added uid=dimakoushha to groups wmde and nda (T354276)
  • 20:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P54859 and previous config saved to /var/cache/conftool/dbconfig/20240117-200006-marostegui.json
  • 19:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P54858 and previous config saved to /var/cache/conftool/dbconfig/20240117-194500-marostegui.json
  • 19:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T354336)', diff saved to https://phabricator.wikimedia.org/P54857 and previous config saved to /var/cache/conftool/dbconfig/20240117-192953-marostegui.json
  • 19:27 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1233 (T354336)', diff saved to https://phabricator.wikimedia.org/P54856 and previous config saved to /var/cache/conftool/dbconfig/20240117-192737-marostegui.json
  • 19:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1233.eqiad.wmnet with reason: Maintenance
  • 19:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1233.eqiad.wmnet with reason: Maintenance
  • 19:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T354336)', diff saved to https://phabricator.wikimedia.org/P54855 and previous config saved to /var/cache/conftool/dbconfig/20240117-192715-marostegui.json
  • 19:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P54854 and previous config saved to /var/cache/conftool/dbconfig/20240117-191209-marostegui.json
  • 19:01 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 19:00 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 18:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P54853 and previous config saved to /var/cache/conftool/dbconfig/20240117-185703-marostegui.json
  • 18:54 jnuche@deploy2002: Finished scap: deploying K8s config changes from T355243 (duration: 01m 42s)
  • 18:52 jnuche@deploy2002: Started scap: deploying K8s config changes from T355243
  • 18:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T354336)', diff saved to https://phabricator.wikimedia.org/P54852 and previous config saved to /var/cache/conftool/dbconfig/20240117-184156-marostegui.json
  • 18:40 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1229 (T354336)', diff saved to https://phabricator.wikimedia.org/P54851 and previous config saved to /var/cache/conftool/dbconfig/20240117-183944-marostegui.json
  • 18:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1229.eqiad.wmnet with reason: Maintenance
  • 18:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1229.eqiad.wmnet with reason: Maintenance
  • 18:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1225.eqiad.wmnet with reason: Maintenance
  • 18:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1225.eqiad.wmnet with reason: Maintenance
  • 18:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1222 (T354336)', diff saved to https://phabricator.wikimedia.org/P54850 and previous config saved to /var/cache/conftool/dbconfig/20240117-183857-marostegui.json
  • 18:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P54849 and previous config saved to /var/cache/conftool/dbconfig/20240117-182351-marostegui.json
  • 18:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P54848 and previous config saved to /var/cache/conftool/dbconfig/20240117-180844-marostegui.json
  • 17:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1222 (T354336)', diff saved to https://phabricator.wikimedia.org/P54847 and previous config saved to /var/cache/conftool/dbconfig/20240117-175338-marostegui.json
  • 17:51 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1222 (T354336)', diff saved to https://phabricator.wikimedia.org/P54846 and previous config saved to /var/cache/conftool/dbconfig/20240117-175120-marostegui.json
  • 17:51 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1222.eqiad.wmnet with reason: Maintenance
  • 17:51 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1222.eqiad.wmnet with reason: Maintenance
  • 17:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T354336)', diff saved to https://phabricator.wikimedia.org/P54845 and previous config saved to /var/cache/conftool/dbconfig/20240117-175059-marostegui.json
  • 17:39 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2395.codfw.wmnet with OS bullseye
  • 17:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P54844 and previous config saved to /var/cache/conftool/dbconfig/20240117-173552-marostegui.json
  • 17:29 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2357.codfw.wmnet with OS bullseye
  • 17:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P54843 and previous config saved to /var/cache/conftool/dbconfig/20240117-172045-marostegui.json
  • 17:19 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
  • 17:19 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2395.codfw.wmnet with reason: host reimage
  • 17:19 denisse@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host grafana2001.codfw.wmnet with OS bookworm
  • 17:18 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
  • 17:16 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2395.codfw.wmnet with reason: host reimage
  • 17:13 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
  • 17:11 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
  • 17:08 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2357.codfw.wmnet with reason: host reimage
  • 17:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T354336)', diff saved to https://phabricator.wikimedia.org/P54842 and previous config saved to /var/cache/conftool/dbconfig/20240117-170539-marostegui.json
  • 17:05 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2357.codfw.wmnet with reason: host reimage
  • 17:03 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1197 (T354336)', diff saved to https://phabricator.wikimedia.org/P54841 and previous config saved to /var/cache/conftool/dbconfig/20240117-170327-marostegui.json
  • 17:03 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1197.eqiad.wmnet with reason: Maintenance
  • 17:03 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1197.eqiad.wmnet with reason: Maintenance
  • 17:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T354336)', diff saved to https://phabricator.wikimedia.org/P54840 and previous config saved to /var/cache/conftool/dbconfig/20240117-170305-marostegui.json
  • 17:02 denisse@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on grafana2001.codfw.wmnet with reason: host reimage
  • 17:00 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2395.codfw.wmnet with OS bullseye
  • 16:57 denisse@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on grafana2001.codfw.wmnet with reason: host reimage
  • 16:48 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2357.codfw.wmnet with OS bullseye
  • 16:48 hnowlan@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host mw2282.codfw.wmnet with OS bullseye
  • 16:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P54839 and previous config saved to /var/cache/conftool/dbconfig/20240117-164759-marostegui.json
  • 16:42 denisse@cumin2002: START - Cookbook sre.hosts.reimage for host grafana2001.codfw.wmnet with OS bookworm
  • 16:41 jforrester@deploy2002: Finished deploy [integration/docroot@f08a107]: I746134 for T354310 (duration: 00m 07s)
  • 16:40 jforrester@deploy2002: Started deploy [integration/docroot@f08a107]: I746134 for T354310
  • 16:39 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2282.codfw.wmnet with OS bullseye
  • 16:39 hnowlan@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host mw2282.codfw.wmnet with OS bullseye
  • 16:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P54838 and previous config saved to /var/cache/conftool/dbconfig/20240117-163252-marostegui.json
  • 16:29 damilare: civicrm upgraded from 5ef5362f to d8b0c977
  • 16:25 kamila@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
  • 16:23 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2282.codfw.wmnet with OS bullseye
  • 16:23 hnowlan@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host mw2282.codfw.wmnet with OS bullseye
  • 16:22 kamila@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
  • 16:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T354336)', diff saved to https://phabricator.wikimedia.org/P54837 and previous config saved to /var/cache/conftool/dbconfig/20240117-161746-marostegui.json
  • 16:15 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1188 (T354336)', diff saved to https://phabricator.wikimedia.org/P54836 and previous config saved to /var/cache/conftool/dbconfig/20240117-161534-marostegui.json
  • 16:15 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1188.eqiad.wmnet with reason: Maintenance
  • 16:15 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1188.eqiad.wmnet with reason: Maintenance
  • 16:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T354336)', diff saved to https://phabricator.wikimedia.org/P54835 and previous config saved to /var/cache/conftool/dbconfig/20240117-161512-marostegui.json
  • 16:14 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 16:13 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
  • 16:13 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 16:13 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
  • 16:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P54834 and previous config saved to /var/cache/conftool/dbconfig/20240117-160005-marostegui.json
  • 15:54 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on an-tool1005.eqiad.wmnet with reason: Testing new version of Superset
  • 15:54 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on an-tool1005.eqiad.wmnet with reason: Testing new version of Superset
  • 15:54 btullis@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 7 days, 0:00:00 on an-tool1005.eqiad.wmnet with reason: Testing new version of Superset
  • 15:54 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on an-tool1005.eqiad.wmnet with reason: Testing new version of Superset
  • 15:49 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2282.codfw.wmnet with OS bullseye
  • 15:49 hnowlan@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host mw2282.codfw.wmnet with OS bullseye
  • 15:45 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 15:45 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 15:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P54833 and previous config saved to /var/cache/conftool/dbconfig/20240117-154459-marostegui.json
  • 15:39 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2045.codfw.wmnet
  • 15:38 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 15:38 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 15:30 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2045.codfw.wmnet
  • 15:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T354336)', diff saved to https://phabricator.wikimedia.org/P54832 and previous config saved to /var/cache/conftool/dbconfig/20240117-152953-marostegui.json
  • 15:28 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1045.eqiad.wmnet
  • 15:27 taavi: restart etherpad-lite.service on etherpad1003
  • 15:27 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1182 (T354336)', diff saved to https://phabricator.wikimedia.org/P54831 and previous config saved to /var/cache/conftool/dbconfig/20240117-152737-marostegui.json
  • 15:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1182.eqiad.wmnet with reason: Maintenance
  • 15:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1182.eqiad.wmnet with reason: Maintenance
  • 15:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T354336)', diff saved to https://phabricator.wikimedia.org/P54830 and previous config saved to /var/cache/conftool/dbconfig/20240117-152715-marostegui.json
  • 15:23 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1045.eqiad.wmnet
  • 15:22 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: cache::text
  • 15:15 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2282.codfw.wmnet with OS bullseye
  • 15:13 hnowlan@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host mw2282.codfw.wmnet with OS bullseye
  • 15:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P54827 and previous config saved to /var/cache/conftool/dbconfig/20240117-151208-marostegui.json
  • 15:10 Lucas_WMDE: UTC afternoon backport+config window done
  • 15:09 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for Exclude qqq from monolingual text languages (T341409) (duration: 07m 59s)
  • 15:05 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply
  • 15:05 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1044.eqiad.wmnet
  • 15:05 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
  • 15:05 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply
  • 15:05 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2044.codfw.wmnet
  • 15:04 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
  • 15:03 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde: Continuing with sync
  • 15:02 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde: Backport for Exclude qqq from monolingual text languages (T341409) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 15:01 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for Exclude qqq from monolingual text languages (T341409)
  • 14:59 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1044.eqiad.wmnet
  • 14:59 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2044.codfw.wmnet
  • 14:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P54826 and previous config saved to /var/cache/conftool/dbconfig/20240117-145702-marostegui.json
  • 14:52 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: cache::text
  • 14:51 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for Skip tainted references test:distnodiff script to fix Wikibase CI (T354881), Only build result entries for used wbsearchentities results (T355053) (duration: 08m 28s)
  • 14:49 claime: restarted rsyslog on kubernetes2048
  • 14:45 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde: Continuing with sync
  • 14:44 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde: Backport for Skip tainted references test:distnodiff script to fix Wikibase CI (T354881), Only build result entries for used wbsearchentities results (T355053) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:43 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for Skip tainted references test:distnodiff script to fix Wikibase CI (T354881), Only build result entries for used wbsearchentities results (T355053)
  • 14:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T354336)', diff saved to https://phabricator.wikimedia.org/P54824 and previous config saved to /var/cache/conftool/dbconfig/20240117-144156-marostegui.json
  • 14:40 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1170:3312 (T354336)', diff saved to https://phabricator.wikimedia.org/P54823 and previous config saved to /var/cache/conftool/dbconfig/20240117-144039-marostegui.json
  • 14:40 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 14:40 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 14:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T354336)', diff saved to https://phabricator.wikimedia.org/P54822 and previous config saved to /var/cache/conftool/dbconfig/20240117-144018-marostegui.json
  • 14:26 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf2002.codfw.wmnet
  • 14:25 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for Only build result entries for used wbsearchentities results (T355053) (duration: 09m 23s)
  • 14:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P54821 and previous config saved to /var/cache/conftool/dbconfig/20240117-142511-marostegui.json
  • 14:23 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2282.codfw.wmnet with OS bullseye
  • 14:22 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc-wf2002.codfw.wmnet
  • 14:22 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf2001.codfw.wmnet
  • 14:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1149.eqiad.wmnet with reason: Maintenance
  • 14:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1149.eqiad.wmnet with reason: Maintenance
  • 14:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P54820 and previous config saved to /var/cache/conftool/dbconfig/20240117-142015-ladsgroup.json
  • 14:19 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde: Continuing with sync
  • 14:17 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde: Backport for Only build result entries for used wbsearchentities results (T355053) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:16 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for Only build result entries for used wbsearchentities results (T355053)
  • 14:16 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc-wf2001.codfw.wmnet
  • 14:14 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for Remove unused $wgExtraLanguageNames['qqq'] assignment (T263441) (duration: 11m 07s)
  • 14:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P54819 and previous config saved to /var/cache/conftool/dbconfig/20240117-141005-marostegui.json
  • 14:07 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde: Continuing with sync
  • 14:07 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde: Backport for Remove unused $wgExtraLanguageNames['qqq'] assignment (T263441) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P54818 and previous config saved to /var/cache/conftool/dbconfig/20240117-140509-ladsgroup.json
  • 14:03 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for Remove unused $wgExtraLanguageNames['qqq'] assignment (T263441)
  • 13:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T354336)', diff saved to https://phabricator.wikimedia.org/P54817 and previous config saved to /var/cache/conftool/dbconfig/20240117-135459-marostegui.json
  • 13:52 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1156 (T354336)', diff saved to https://phabricator.wikimedia.org/P54816 and previous config saved to /var/cache/conftool/dbconfig/20240117-135242-marostegui.json
  • 13:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 13:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 13:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1156.eqiad.wmnet with reason: Maintenance
  • 13:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1156.eqiad.wmnet with reason: Maintenance
  • 13:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T354336)', diff saved to https://phabricator.wikimedia.org/P54815 and previous config saved to /var/cache/conftool/dbconfig/20240117-135158-marostegui.json
  • 13:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P54814 and previous config saved to /var/cache/conftool/dbconfig/20240117-135002-ladsgroup.json
  • 13:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P54813 and previous config saved to /var/cache/conftool/dbconfig/20240117-133652-marostegui.json
  • 13:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host snapshot1014.eqiad.wmnet
  • 13:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P54812 and previous config saved to /var/cache/conftool/dbconfig/20240117-133456-ladsgroup.json
  • 13:34 damilare: payments-wiki upgraded from 12d8ad5b to e38b24f0
  • 13:32 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:32 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:30 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:30 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:30 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host snapshot1014.eqiad.wmnet
  • 13:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P54811 and previous config saved to /var/cache/conftool/dbconfig/20240117-132145-marostegui.json
  • 13:19 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2267.codfw.wmnet with OS bullseye
  • 13:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T354336)', diff saved to https://phabricator.wikimedia.org/P54810 and previous config saved to /var/cache/conftool/dbconfig/20240117-130639-marostegui.json
  • 13:04 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1146:3312 (T354336)', diff saved to https://phabricator.wikimedia.org/P54809 and previous config saved to /var/cache/conftool/dbconfig/20240117-130422-marostegui.json
  • 13:04 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1146.eqiad.wmnet with reason: Maintenance
  • 13:04 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1146.eqiad.wmnet with reason: Maintenance
  • 13:04 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1139.eqiad.wmnet with reason: Maintenance
  • 13:03 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1139.eqiad.wmnet with reason: Maintenance
  • 13:01 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2113.codfw.wmnet with reason: Maintenance
  • 13:01 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2113.codfw.wmnet with reason: Maintenance
  • 12:59 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2267.codfw.wmnet with reason: host reimage
  • 12:58 taavi: removing vlan1119 interface on lvs1018 T355115
  • 12:56 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2267.codfw.wmnet with reason: host reimage
  • 12:47 taavi: removing vlan1119 interface on lvs1020 T355115
  • 12:38 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2267.codfw.wmnet with OS bullseye
  • 12:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192 (T354336)', diff saved to https://phabricator.wikimedia.org/P54806 and previous config saved to /var/cache/conftool/dbconfig/20240117-122305-marostegui.json
  • 12:22 hnowlan: setting mw[2267,2282,2357,2395] inactive in advance of reimaging
  • 12:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P54805 and previous config saved to /var/cache/conftool/dbconfig/20240117-120758-marostegui.json
  • 12:06 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1002.eqiad.wmnet
  • 12:00 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc-wf1002.eqiad.wmnet
  • 12:00 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1001.eqiad.wmnet
  • 12:00 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on mw2394.codfw.wmnet with reason: Bad DIMM
  • 12:00 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2044.codfw.wmnet
  • 12:00 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on mw2394.codfw.wmnet with reason: Bad DIMM
  • 11:59 cgoubert@cumin2002: conftool action : set/pooled=inactive; selector: name=mw2394.codfw.wmnet
  • 11:55 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2044.codfw.wmnet
  • 11:54 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc-wf1001.eqiad.wmnet
  • 11:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P54804 and previous config saved to /var/cache/conftool/dbconfig/20240117-115252-marostegui.json
  • 11:52 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1044.eqiad.wmnet
  • 11:46 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1044.eqiad.wmnet
  • 11:40 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2044.codfw.wmnet
  • 11:40 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1044.eqiad.wmnet
  • 11:39 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: memcached
  • 11:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192 (T354336)', diff saved to https://phabricator.wikimedia.org/P54803 and previous config saved to /var/cache/conftool/dbconfig/20240117-113745-marostegui.json
  • 11:34 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: memcached
  • 11:34 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1044.eqiad.wmnet
  • 11:34 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2192 (T354336)', diff saved to https://phabricator.wikimedia.org/P54802 and previous config saved to /var/cache/conftool/dbconfig/20240117-113432-marostegui.json
  • 11:34 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2044.codfw.wmnet
  • 11:34 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2192.codfw.wmnet with reason: Maintenance
  • 11:34 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2192.codfw.wmnet with reason: Maintenance
  • 11:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T354336)', diff saved to https://phabricator.wikimedia.org/P54801 and previous config saved to /var/cache/conftool/dbconfig/20240117-113410-marostegui.json
  • 11:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P54800 and previous config saved to /var/cache/conftool/dbconfig/20240117-111904-marostegui.json
  • 11:09 Dreamy_Jazz: T351400 running on a tmux session `mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --use-jobqueue --sleep 30 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-30.txt`
  • 11:09 Dreamy_Jazz: stopped scanning script
  • 11:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P54799 and previous config saved to /var/cache/conftool/dbconfig/20240117-110357-marostegui.json
  • 10:49 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1043.eqiad.wmnet
  • 10:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T354336)', diff saved to https://phabricator.wikimedia.org/P54798 and previous config saved to /var/cache/conftool/dbconfig/20240117-104851-marostegui.json
  • 10:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2178 (T354336)', diff saved to https://phabricator.wikimedia.org/P54797 and previous config saved to /var/cache/conftool/dbconfig/20240117-104438-marostegui.json
  • 10:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2178.codfw.wmnet with reason: Maintenance
  • 10:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2178.codfw.wmnet with reason: Maintenance
  • 10:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315 (T354336)', diff saved to https://phabricator.wikimedia.org/P54796 and previous config saved to /var/cache/conftool/dbconfig/20240117-104416-marostegui.json
  • 10:43 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1043.eqiad.wmnet
  • 10:33 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2043.codfw.wmnet
  • 10:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315', diff saved to https://phabricator.wikimedia.org/P54795 and previous config saved to /var/cache/conftool/dbconfig/20240117-102909-marostegui.json
  • 10:26 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2043.codfw.wmnet
  • 10:26 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:26 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:18 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2043.codfw.wmnet
  • 10:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315', diff saved to https://phabricator.wikimedia.org/P54793 and previous config saved to /var/cache/conftool/dbconfig/20240117-101403-marostegui.json
  • 10:12 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2043.codfw.wmnet
  • 09:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315 (T354336)', diff saved to https://phabricator.wikimedia.org/P54792 and previous config saved to /var/cache/conftool/dbconfig/20240117-095856-marostegui.json
  • 09:58 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:58 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:58 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:55 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2171:3315 (T354336)', diff saved to https://phabricator.wikimedia.org/P54791 and previous config saved to /var/cache/conftool/dbconfig/20240117-095544-marostegui.json
  • 09:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2171.codfw.wmnet with reason: Maintenance
  • 09:55 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2171.codfw.wmnet with reason: Maintenance
  • 09:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T354336)', diff saved to https://phabricator.wikimedia.org/P54790 and previous config saved to /var/cache/conftool/dbconfig/20240117-095521-marostegui.json
  • 09:53 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1043.eqiad.wmnet
  • 09:51 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2042.codfw.wmnet
  • 09:51 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1042.eqiad.wmnet
  • 09:46 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1043.eqiad.wmnet
  • 09:45 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1042.eqiad.wmnet
  • 09:45 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2042.codfw.wmnet
  • 09:40 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1042.eqiad.wmnet
  • 09:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P54789 and previous config saved to /var/cache/conftool/dbconfig/20240117-094015-marostegui.json
  • 09:36 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1042.eqiad.wmnet
  • 09:35 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:35 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:30 jmm@cumin2002: END (FAIL) - Cookbook sre.puppet.migrate-host (exit_code=99) for host mc2042.codfw.wmnet
  • 09:29 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2042.codfw.wmnet
  • 09:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P54788 and previous config saved to /var/cache/conftool/dbconfig/20240117-092507-marostegui.json
  • 09:21 jnuche@deploy2002: Synchronized php: group1 wikis to 1.42.0-wmf.14 refs T354432 (duration: 06m 15s)
  • 09:15 jnuche@deploy2002: rebuilt and synchronized wikiversions files: group1 wikis to 1.42.0-wmf.14 refs T354432
  • 09:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T354336)', diff saved to https://phabricator.wikimedia.org/P54787 and previous config saved to /var/cache/conftool/dbconfig/20240117-091000-marostegui.json
  • 09:08 jmm@cumin2002: END (FAIL) - Cookbook sre.puppet.migrate-host (exit_code=99) for host mc2042.codfw.wmnet
  • 09:06 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2157 (T354336)', diff saved to https://phabricator.wikimedia.org/P54786 and previous config saved to /var/cache/conftool/dbconfig/20240117-090648-marostegui.json
  • 09:06 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2157.codfw.wmnet with reason: Maintenance
  • 09:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2157.codfw.wmnet with reason: Maintenance
  • 09:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315 (T354336)', diff saved to https://phabricator.wikimedia.org/P54785 and previous config saved to /var/cache/conftool/dbconfig/20240117-090626-marostegui.json
  • 09:02 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2042.codfw.wmnet
  • 08:56 dcausse@deploy2002: Finished scap: Backport for enable page_rerender for all wikis (T351503) (duration: 09m 15s)
  • 08:55 moritzm: installing Python 2.7 security updates
  • 08:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315', diff saved to https://phabricator.wikimedia.org/P54784 and previous config saved to /var/cache/conftool/dbconfig/20240117-085119-marostegui.json
  • 08:50 dcausse@deploy2002: pfischer and dcausse: Continuing with sync
  • 08:48 dcausse@deploy2002: pfischer and dcausse: Backport for enable page_rerender for all wikis (T351503) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:46 dcausse@deploy2002: Started scap: Backport for enable page_rerender for all wikis (T351503)
  • 08:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315', diff saved to https://phabricator.wikimedia.org/P54783 and previous config saved to /var/cache/conftool/dbconfig/20240117-083613-marostegui.json
  • 08:23 arnaudb@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20 days, 0:00:00 on db2194.codfw.wmnet with reason: debugging something before T343674
  • 08:22 arnaudb@cumin1001: START - Cookbook sre.hosts.downtime for 20 days, 0:00:00 on db2194.codfw.wmnet with reason: debugging something before T343674
  • 08:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315 (T354336)', diff saved to https://phabricator.wikimedia.org/P54782 and previous config saved to /var/cache/conftool/dbconfig/20240117-082106-marostegui.json
  • 08:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P54781 and previous config saved to /var/cache/conftool/dbconfig/20240117-082001-ladsgroup.json
  • 08:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1146.eqiad.wmnet with reason: Maintenance
  • 08:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1146.eqiad.wmnet with reason: Maintenance
  • 08:17 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2137:3315 (T354336)', diff saved to https://phabricator.wikimedia.org/P54780 and previous config saved to /var/cache/conftool/dbconfig/20240117-081754-marostegui.json
  • 08:17 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2137.codfw.wmnet with reason: Maintenance
  • 08:17 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2137.codfw.wmnet with reason: Maintenance
  • 08:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2128 (T354336)', diff saved to https://phabricator.wikimedia.org/P54779 and previous config saved to /var/cache/conftool/dbconfig/20240117-081731-marostegui.json
  • 08:16 moritzm: installing python-git security updates
  • 08:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2128', diff saved to https://phabricator.wikimedia.org/P54778 and previous config saved to /var/cache/conftool/dbconfig/20240117-080225-marostegui.json
  • 07:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2128', diff saved to https://phabricator.wikimedia.org/P54777 and previous config saved to /var/cache/conftool/dbconfig/20240117-074719-marostegui.json
  • 07:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2128 (T354336)', diff saved to https://phabricator.wikimedia.org/P54776 and previous config saved to /var/cache/conftool/dbconfig/20240117-073212-marostegui.json
  • 07:29 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2128 (T354336)', diff saved to https://phabricator.wikimedia.org/P54775 and previous config saved to /var/cache/conftool/dbconfig/20240117-072902-marostegui.json
  • 07:28 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 07:28 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 07:28 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2128.codfw.wmnet with reason: Maintenance
  • 07:28 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2128.codfw.wmnet with reason: Maintenance
  • 07:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2123 (T354336)', diff saved to https://phabricator.wikimedia.org/P54774 and previous config saved to /var/cache/conftool/dbconfig/20240117-072824-marostegui.json
  • 07:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2123', diff saved to https://phabricator.wikimedia.org/P54773 and previous config saved to /var/cache/conftool/dbconfig/20240117-071317-marostegui.json
  • 06:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2123', diff saved to https://phabricator.wikimedia.org/P54772 and previous config saved to /var/cache/conftool/dbconfig/20240117-065811-marostegui.json
  • 06:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2123 (T354336)', diff saved to https://phabricator.wikimedia.org/P54771 and previous config saved to /var/cache/conftool/dbconfig/20240117-064304-marostegui.json
  • 06:39 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2123 (T354336)', diff saved to https://phabricator.wikimedia.org/P54770 and previous config saved to /var/cache/conftool/dbconfig/20240117-063951-marostegui.json
  • 06:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2123.codfw.wmnet with reason: Maintenance
  • 06:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2123.codfw.wmnet with reason: Maintenance
  • 06:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2111 (T354336)', diff saved to https://phabricator.wikimedia.org/P54769 and previous config saved to /var/cache/conftool/dbconfig/20240117-063929-marostegui.json
  • 06:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2111', diff saved to https://phabricator.wikimedia.org/P54768 and previous config saved to /var/cache/conftool/dbconfig/20240117-062422-marostegui.json
  • 06:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2111', diff saved to https://phabricator.wikimedia.org/P54767 and previous config saved to /var/cache/conftool/dbconfig/20240117-060916-marostegui.json
  • 05:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2111 (T354336)', diff saved to https://phabricator.wikimedia.org/P54766 and previous config saved to /var/cache/conftool/dbconfig/20240117-055409-marostegui.json
  • 05:50 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2111 (T354336)', diff saved to https://phabricator.wikimedia.org/P54765 and previous config saved to /var/cache/conftool/dbconfig/20240117-055056-marostegui.json
  • 05:50 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2111.codfw.wmnet with reason: Maintenance
  • 05:50 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2111.codfw.wmnet with reason: Maintenance
  • 05:49 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2101.codfw.wmnet with reason: Maintenance
  • 05:49 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2101.codfw.wmnet with reason: Maintenance
  • 05:47 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1183.eqiad.wmnet with reason: Maintenance
  • 05:47 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1183.eqiad.wmnet with reason: Maintenance
  • 03:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 03:37 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 03:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P54764 and previous config saved to /var/cache/conftool/dbconfig/20240117-033751-ladsgroup.json
  • 03:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P54763 and previous config saved to /var/cache/conftool/dbconfig/20240117-032245-ladsgroup.json
  • 03:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P54762 and previous config saved to /var/cache/conftool/dbconfig/20240117-030738-ladsgroup.json
  • 02:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P54761 and previous config saved to /var/cache/conftool/dbconfig/20240117-025232-ladsgroup.json
  • 00:03 tstarling@deploy2002: Synchronized wmf-config: T344791 related cleanup (duration: 06m 22s)

2024-01-16

  • 23:55 tstarling@deploy2002: Synchronized wmf-config/CommonSettings.php: Disable wgUseSameSiteLegacyCookies T344791 (duration: 09m 19s)
  • 21:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P54760 and previous config saved to /var/cache/conftool/dbconfig/20240116-214016-ladsgroup.json
  • 21:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1144.eqiad.wmnet with reason: Maintenance
  • 21:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1144.eqiad.wmnet with reason: Maintenance
  • 20:43 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2297.codfw.wmnet with OS bullseye
  • 20:37 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2296.codfw.wmnet with OS bullseye
  • 20:30 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2295.codfw.wmnet with OS bullseye
  • 20:26 ryankemper: T351650 Running puppet on `P:trafficserver::backend` following merge of https://gerrit.wikimedia.org/r/c/operations/puppet/+/991091
  • 20:25 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2294.codfw.wmnet with OS bullseye
  • 20:23 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2297.codfw.wmnet with reason: host reimage
  • 20:20 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2297.codfw.wmnet with reason: host reimage
  • 20:17 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2296.codfw.wmnet with reason: host reimage
  • 20:16 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2292.codfw.wmnet with OS bullseye
  • 20:13 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2293.codfw.wmnet with OS bullseye
  • 20:13 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2296.codfw.wmnet with reason: host reimage
  • 20:12 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2291.codfw.wmnet with OS bullseye
  • 20:11 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2295.codfw.wmnet with reason: host reimage
  • 20:08 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2295.codfw.wmnet with reason: host reimage
  • 20:06 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2294.codfw.wmnet with reason: host reimage
  • 20:03 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2297.codfw.wmnet with OS bullseye
  • 20:02 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2294.codfw.wmnet with reason: host reimage
  • 19:56 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2296.codfw.wmnet with OS bullseye
  • 19:56 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2292.codfw.wmnet with reason: host reimage
  • 19:53 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2293.codfw.wmnet with reason: host reimage
  • 19:52 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2295.codfw.wmnet with OS bullseye
  • 19:50 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2291.codfw.wmnet with reason: host reimage
  • 19:50 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1375.eqiad.wmnet with OS bullseye
  • 19:49 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2293.codfw.wmnet with reason: host reimage
  • 19:48 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2292.codfw.wmnet with reason: host reimage
  • 19:47 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2291.codfw.wmnet with reason: host reimage
  • 19:47 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1376.eqiad.wmnet with OS bullseye
  • 19:46 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2294.codfw.wmnet with OS bullseye
  • 19:45 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1374.eqiad.wmnet with OS bullseye
  • 19:45 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 19:45 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 19:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1230 (T354336)', diff saved to https://phabricator.wikimedia.org/P54759 and previous config saved to /var/cache/conftool/dbconfig/20240116-194509-marostegui.json
  • 19:34 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1360.eqiad.wmnet with OS bullseye
  • 19:32 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2293.codfw.wmnet with OS bullseye
  • 19:31 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2292.codfw.wmnet with OS bullseye
  • 19:31 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2291.codfw.wmnet with OS bullseye
  • 19:31 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1363.eqiad.wmnet with OS bullseye
  • 19:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1230', diff saved to https://phabricator.wikimedia.org/P54758 and previous config saved to /var/cache/conftool/dbconfig/20240116-193002-marostegui.json
  • 19:29 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1375.eqiad.wmnet with reason: host reimage
  • 19:29 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1361.eqiad.wmnet with OS bullseye
  • 19:27 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1362.eqiad.wmnet with OS bullseye
  • 19:27 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1376.eqiad.wmnet with reason: host reimage
  • 19:24 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1374.eqiad.wmnet with reason: host reimage
  • 19:23 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1376.eqiad.wmnet with reason: host reimage
  • 19:21 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1375.eqiad.wmnet with reason: host reimage
  • 19:21 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1374.eqiad.wmnet with reason: host reimage
  • 19:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1230', diff saved to https://phabricator.wikimedia.org/P54757 and previous config saved to /var/cache/conftool/dbconfig/20240116-191456-marostegui.json
  • 19:13 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1360.eqiad.wmnet with reason: host reimage
  • 19:10 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1363.eqiad.wmnet with reason: host reimage
  • 19:08 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1361.eqiad.wmnet with reason: host reimage
  • 19:08 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1376.eqiad.wmnet with OS bullseye
  • 19:07 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw1362.eqiad.wmnet with reason: host reimage
  • 19:07 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1375.eqiad.wmnet with OS bullseye
  • 19:06 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1374.eqiad.wmnet with OS bullseye
  • 19:06 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1363.eqiad.wmnet with reason: host reimage
  • 19:05 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1362.eqiad.wmnet with reason: host reimage
  • 19:05 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1360.eqiad.wmnet with reason: host reimage
  • 19:04 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1361.eqiad.wmnet with reason: host reimage
  • 18:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1230 (T354336)', diff saved to https://phabricator.wikimedia.org/P54756 and previous config saved to /var/cache/conftool/dbconfig/20240116-185949-marostegui.json
  • 18:57 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1230 (T354336)', diff saved to https://phabricator.wikimedia.org/P54755 and previous config saved to /var/cache/conftool/dbconfig/20240116-185723-marostegui.json
  • 18:57 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1230.eqiad.wmnet with reason: Maintenance
  • 18:57 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1230.eqiad.wmnet with reason: Maintenance
  • 18:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1216.eqiad.wmnet with reason: Maintenance
  • 18:56 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1216.eqiad.wmnet with reason: Maintenance
  • 18:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213:3315 (T354336)', diff saved to https://phabricator.wikimedia.org/P54754 and previous config saved to /var/cache/conftool/dbconfig/20240116-185626-marostegui.json
  • 18:51 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1363.eqiad.wmnet with OS bullseye
  • 18:51 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1362.eqiad.wmnet with OS bullseye
  • 18:50 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1361.eqiad.wmnet with OS bullseye
  • 18:50 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1360.eqiad.wmnet with OS bullseye
  • 18:42 mutante: phab2002 - pulling repo data from phab1004 by running sync script created by rsync::quickdatacopy after gerrit:990247 T354221
  • 18:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213:3315', diff saved to https://phabricator.wikimedia.org/P54753 and previous config saved to /var/cache/conftool/dbconfig/20240116-184120-marostegui.json
  • 18:38 Dreamy_Jazz: T351400 running on a tmux session `mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --sleep 1 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-non-job-queue.txt`
  • 18:36 Dreamy_Jazz: stopped tmux session `mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --use-jobqueue --sleep 30 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-30.txt`
  • 18:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213:3315', diff saved to https://phabricator.wikimedia.org/P54752 and previous config saved to /var/cache/conftool/dbconfig/20240116-182613-marostegui.json
  • 18:20 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 18:19 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 18:19 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 18:19 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 18:18 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 18:18 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 18:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213:3315 (T354336)', diff saved to https://phabricator.wikimedia.org/P54751 and previous config saved to /var/cache/conftool/dbconfig/20240116-181107-marostegui.json
  • 18:08 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1213:3315 (T354336)', diff saved to https://phabricator.wikimedia.org/P54750 and previous config saved to /var/cache/conftool/dbconfig/20240116-180841-marostegui.json
  • 18:08 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1213.eqiad.wmnet with reason: Maintenance
  • 18:08 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1213.eqiad.wmnet with reason: Maintenance
  • 18:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1210 (T354336)', diff saved to https://phabricator.wikimedia.org/P54749 and previous config saved to /var/cache/conftool/dbconfig/20240116-180819-marostegui.json
  • 17:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P54748 and previous config saved to /var/cache/conftool/dbconfig/20240116-175313-marostegui.json
  • 17:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P54747 and previous config saved to /var/cache/conftool/dbconfig/20240116-173806-marostegui.json
  • 17:32 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1460.eqiad.wmnet with OS bullseye
  • 17:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1210 (T354336)', diff saved to https://phabricator.wikimedia.org/P54746 and previous config saved to /var/cache/conftool/dbconfig/20240116-172300-marostegui.json
  • 17:20 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1210 (T354336)', diff saved to https://phabricator.wikimedia.org/P54745 and previous config saved to /var/cache/conftool/dbconfig/20240116-172032-marostegui.json
  • 17:20 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1210.eqiad.wmnet with reason: Maintenance
  • 17:20 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1210.eqiad.wmnet with reason: Maintenance
  • 17:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T354336)', diff saved to https://phabricator.wikimedia.org/P54744 and previous config saved to /var/cache/conftool/dbconfig/20240116-172011-marostegui.json
  • 17:14 topranks: Disabling puppet and PyBal on lvs2012 ahead of migration of network link to lsw1-b2-codfw T352909
  • 17:12 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1460.eqiad.wmnet with reason: host reimage
  • 17:11 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2012.codfw.wmnet with reason: moving lvs hosts codfw T352784 T352918
  • 17:11 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs2012.codfw.wmnet with reason: moving lvs hosts codfw T352784 T352918
  • 17:10 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1460.eqiad.wmnet with reason: host reimage
  • 17:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P54743 and previous config saved to /var/cache/conftool/dbconfig/20240116-170503-marostegui.json
  • 16:56 filippo@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on prometheus1006.eqiad.wmnet with reason: memory upgrade
  • 16:56 filippo@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on prometheus1006.eqiad.wmnet with reason: memory upgrade
  • 16:56 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw1460.eqiad.wmnet with OS bullseye
  • 16:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P54742 and previous config saved to /var/cache/conftool/dbconfig/20240116-164957-marostegui.json
  • 16:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T354336)', diff saved to https://phabricator.wikimedia.org/P54741 and previous config saved to /var/cache/conftool/dbconfig/20240116-163449-marostegui.json
  • 16:33 filippo@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on prometheus1005.eqiad.wmnet with reason: memory upgrade
  • 16:33 filippo@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on prometheus1005.eqiad.wmnet with reason: memory upgrade
  • 16:32 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1200 (T354336)', diff saved to https://phabricator.wikimedia.org/P54740 and previous config saved to /var/cache/conftool/dbconfig/20240116-163224-marostegui.json
  • 16:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1200.eqiad.wmnet with reason: Maintenance
  • 16:32 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1200.eqiad.wmnet with reason: Maintenance
  • 16:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T354336)', diff saved to https://phabricator.wikimedia.org/P54739 and previous config saved to /var/cache/conftool/dbconfig/20240116-163203-marostegui.json
  • 16:22 brennen@deploy2002: Finished deploy [phabricator/deployment@24a2a2a]: deploy to phab1004 for T354969 (duration: 00m 50s)
  • 16:22 brennen@deploy2002: Started deploy [phabricator/deployment@24a2a2a]: deploy to phab1004 for T354969
  • 16:21 brennen@deploy2002: Finished deploy [phabricator/deployment@24a2a2a]: deploy to phab2002 for T354969 (duration: 00m 27s)
  • 16:21 brennen@deploy2002: Started deploy [phabricator/deployment@24a2a2a]: deploy to phab2002 for T354969
  • 16:20 mutante: phabricator deploy is imminent
  • 16:20 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on phab1004.eqiad.wmnet with reason: deployment
  • 16:20 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on phab1004.eqiad.wmnet with reason: deployment
  • 16:20 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on phab2002.codfw.wmnet with reason: deployment
  • 16:19 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on phab2002.codfw.wmnet with reason: deployment
  • 16:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P54738 and previous config saved to /var/cache/conftool/dbconfig/20240116-161656-marostegui.json
  • 16:03 Dreamy_Jazz: T351400 running on a tmux session `mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --use-jobqueue --sleep 30 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-30.txt`
  • 16:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P54737 and previous config saved to /var/cache/conftool/dbconfig/20240116-160150-marostegui.json
  • 16:00 Dreamy_Jazz: stopped mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --use-jobqueue --sleep 30 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-30.txt
  • 15:55 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on re0.cr[1-2]-codfw.mgmt with reason: moving lvs hosts codfw T352784 T352918
  • 15:55 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on re0.cr[1-2]-codfw.mgmt with reason: moving lvs hosts codfw T352784 T352918
  • 15:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T354336)', diff saved to https://phabricator.wikimedia.org/P54736 and previous config saved to /var/cache/conftool/dbconfig/20240116-154643-marostegui.json
  • 15:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1185 (T354336)', diff saved to https://phabricator.wikimedia.org/P54735 and previous config saved to /var/cache/conftool/dbconfig/20240116-154419-marostegui.json
  • 15:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1185.eqiad.wmnet with reason: Maintenance
  • 15:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1185.eqiad.wmnet with reason: Maintenance
  • 15:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T354336)', diff saved to https://phabricator.wikimedia.org/P54734 and previous config saved to /var/cache/conftool/dbconfig/20240116-154357-marostegui.json
  • 15:29 Dreamy_Jazz: T351400 running mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --use-jobqueue --sleep 30 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-30.txt
  • 15:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P54733 and previous config saved to /var/cache/conftool/dbconfig/20240116-152850-marostegui.json
  • 15:28 Dreamy_Jazz: stopped mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --use-jobqueue --sleep 25 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-25.txt
  • 15:27 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cr[1-2]-codfw,cr[1-2]-codfw IPv6,lvs2013 with reason: moving lvs hosts codfw T352784
  • 15:27 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cr[1-2]-codfw,cr[1-2]-codfw IPv6,lvs2013 with reason: moving lvs hosts codfw T352784
  • 15:19 topranks: Disabling puppet and PyBal on lvs2013 ahead of migration of network link to ssw1-a1-codfw T352784
  • 15:18 Dreamy_Jazz: T351400 running mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --use-jobqueue --sleep 25 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-20.txt
  • 15:18 Dreamy_Jazz: stopped mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --use-jobqueue --sleep 20 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-20.txt
  • 15:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P54732 and previous config saved to /var/cache/conftool/dbconfig/20240116-151344-marostegui.json
  • 15:13 Dreamy_Jazz: T351400 running mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --use-jobqueue --sleep 20 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-20.txt
  • 15:11 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 15:07 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 15:00 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:00 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove old records for cloud-support1-c-eqiad - cmooney@cumin1002"
  • 14:58 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove old records for cloud-support1-c-eqiad - cmooney@cumin1002"
  • 14:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T354336)', diff saved to https://phabricator.wikimedia.org/P54731 and previous config saved to /var/cache/conftool/dbconfig/20240116-145837-marostegui.json
  • 14:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1161 (T354336)', diff saved to https://phabricator.wikimedia.org/P54730 and previous config saved to /var/cache/conftool/dbconfig/20240116-145613-marostegui.json
  • 14:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 14:55 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 14:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1161.eqiad.wmnet with reason: Maintenance
  • 14:55 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1161.eqiad.wmnet with reason: Maintenance
  • 14:55 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 14:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 14:55 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1145.eqiad.wmnet with reason: Maintenance
  • 14:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T354336)', diff saved to https://phabricator.wikimedia.org/P54729 and previous config saved to /var/cache/conftool/dbconfig/20240116-145458-marostegui.json
  • 14:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P54728 and previous config saved to /var/cache/conftool/dbconfig/20240116-143951-marostegui.json
  • 14:33 moritzm: installing ca-certificates-java bugfix updates on bookworm
  • 14:31 Dreamy_Jazz: UTC afternoon deploys done
  • 14:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P54727 and previous config saved to /var/cache/conftool/dbconfig/20240116-142444-marostegui.json
  • 14:24 dreamyjazz@deploy2002: Finished scap: Backport for Add more statsd counters and add logstash logging (T351419) (duration: 07m 15s)
  • 14:18 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
  • 14:18 dreamyjazz@deploy2002: dreamyjazz: Backport for Add more statsd counters and add logstash logging (T351419) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:17 moritzm: installing 5.10.205 kernels on buster hosts running the 5.10 backport
  • 14:16 dreamyjazz@deploy2002: Started scap: Backport for Add more statsd counters and add logstash logging (T351419)
  • 14:14 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2042.codfw.wmnet
  • 14:14 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1041.eqiad.wmnet
  • 14:11 dreamyjazz@deploy2002: Finished scap: Backport for Support parallel PhotoDNA requests (T354408) (duration: 07m 14s)
  • 14:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T354336)', diff saved to https://phabricator.wikimedia.org/P54726 and previous config saved to /var/cache/conftool/dbconfig/20240116-140938-marostegui.json
  • 14:07 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2042.codfw.wmnet
  • 14:07 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1041.eqiad.wmnet
  • 14:07 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1144:3315 (T354336)', diff saved to https://phabricator.wikimedia.org/P54725 and previous config saved to /var/cache/conftool/dbconfig/20240116-140713-marostegui.json
  • 14:07 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1144.eqiad.wmnet with reason: Maintenance
  • 14:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1144.eqiad.wmnet with reason: Maintenance
  • 14:05 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
  • 14:05 dreamyjazz@deploy2002: dreamyjazz: Backport for Support parallel PhotoDNA requests (T354408) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:04 dreamyjazz@deploy2002: Started scap: Backport for Support parallel PhotoDNA requests (T354408)
  • 13:54 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 13:35 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-wf1001.eqiad.wmnet with OS bullseye
  • 13:18 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-wf1001.eqiad.wmnet with reason: host reimage
  • 13:15 jiji@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-wf1001.eqiad.wmnet with reason: host reimage
  • 13:09 jelto@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 13:09 jelto@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 13:08 jelto@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 13:08 jelto@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 13:06 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 13:05 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 13:02 effie: reimage mc-wf1001 (part of puppet7 migration)
  • 13:01 jiji@cumin1002: START - Cookbook sre.hosts.reimage for host mc-wf1001.eqiad.wmnet with OS bullseye
  • 12:57 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1040.eqiad.wmnet
  • 12:56 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2041.codfw.wmnet
  • 12:52 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1040.eqiad.wmnet
  • 12:50 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2041.codfw.wmnet
  • 12:30 moritzm: installing systemd bugfix updates from Bullseye point release
  • 12:18 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc-wf1001.eqiad.wmnet
  • 12:18 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2040.codfw.wmnet
  • 12:11 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2040.codfw.wmnet
  • 12:10 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc-wf1001.eqiad.wmnet
  • 11:56 jnuche@deploy2002: rebuilt and synchronized wikiversions files: group0 wikis to 1.42.0-wmf.14 refs T354432
  • 11:45 jnuche@deploy2002: Finished scap: Backport for PreAuthenticationProvider: Deny account creation based on ipoid data (T354928) (duration: 29m 32s)
  • 11:45 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2041.codfw.wmnet
  • 11:39 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2041.codfw.wmnet
  • 11:36 jnuche@deploy2002: jnuche and kharlan: Continuing with sync
  • 11:36 jnuche@deploy2002: jnuche and kharlan: Backport for PreAuthenticationProvider: Deny account creation based on ipoid data (T354928) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 11:33 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2040.codfw.wmnet
  • 11:26 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2040.codfw.wmnet
  • 11:23 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1041.eqiad.wmnet
  • 11:19 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2039.codfw.wmnet
  • 11:16 jnuche@deploy2002: Started scap: Backport for PreAuthenticationProvider: Deny account creation based on ipoid data (T354928)
  • 11:15 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1041.eqiad.wmnet
  • 11:13 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2039.codfw.wmnet
  • 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1040.eqiad.wmnet
  • 11:08 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1040.eqiad.wmnet
  • 10:59 jnuche@deploy2002: Finished scap: testwikis wikis to 1.42.0-wmf.14 refs T354432 (duration: 29m 36s)
  • 10:53 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1039.eqiad.wmnet
  • 10:47 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1039.eqiad.wmnet
  • 10:41 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2039.codfw.wmnet
  • 10:35 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2038.codfw.wmnet
  • 10:30 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2039.codfw.wmnet
  • 10:30 jnuche@deploy2002: Started scap: testwikis wikis to 1.42.0-wmf.14 refs T354432
  • 10:29 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2038.codfw.wmnet
  • 10:24 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2038.codfw.wmnet
  • 10:21 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1038.eqiad.wmnet
  • 10:16 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2038.codfw.wmnet
  • 10:16 godog: also cleaning up 1.42.0-wmf.9, 1.42.0-wmf.10, 1.42.0-wmf.12 from mw22* - T355117
  • 10:15 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1038.eqiad.wmnet
  • 10:12 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1039.eqiad.wmnet
  • 10:10 godog: manually pruning php-1.42.0-wmf.7 from mw22* - T355117
  • 10:07 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1039.eqiad.wmnet
  • 10:06 jnuche@deploy2002: Pruned MediaWiki: 1.42.0-wmf.7, 1.42.0-wmf.9, 1.42.0-wmf.10, 1.42.0-wmf.12 (duration: 07m 08s)
  • 10:05 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1038.eqiad.wmnet
  • 10:00 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1038.eqiad.wmnet
  • 09:51 jnuche@deploy2002: Finished scap: testwikis wikis to 1.42.0-wmf.14 refs T354432 (duration: 52m 52s)
  • 09:28 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "set cloudvirt2004-dev as active - taavi@cumin1002"
  • 09:26 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "set cloudvirt2004-dev as active - taavi@cumin1002"
  • 09:25 taavi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:23 taavi@cumin1002: START - Cookbook sre.dns.netbox
  • 09:05 root@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Daniram3 out of all services on: 2211 hosts
  • 09:04 denisse: reprepro: Copy grafana v9.4.14 from buster to bookworm - T352665
  • 09:03 denisse: reprepro: Copy grafana v9.4.14 from buster to bookworm
  • 09:03 root@cumin2002: START - Cookbook sre.idm.logout Logging Daniram3 out of all services on: 2211 hosts
  • 08:59 jnuche@deploy2002: Started scap: testwikis wikis to 1.42.0-wmf.14 refs T354432

2024-01-15

  • 21:46 reedy@deploy2002: Synchronized wmf-config/: Fix more stringified class names (duration: 06m 29s)
  • 21:37 fab@deploy2002: Finished deploy [airflow-dags/research@9b6a69a]: (no justification provided) (duration: 00m 27s)
  • 21:37 reedy@deploy2002: Synchronized wmf-config/InitialiseSettings.php: Swap stringified class names in ConfirmEdit usages (duration: 06m 30s)
  • 21:36 fab@deploy2002: Started deploy [airflow-dags/research@9b6a69a]: (no justification provided)
  • 21:23 tgr: UTC late deploys done
  • 21:22 tgr@deploy2002: Finished scap: Backport for Log emails in production (duration: 09m 11s)
  • 21:15 tgr@deploy2002: tgr: Continuing with sync
  • 21:14 tgr@deploy2002: tgr: Backport for Log emails in production synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:12 tgr@deploy2002: Started scap: Backport for Log emails in production
  • 19:23 tzatziki: creating the u4c2024_edits table on all wikis
  • 17:55 btullis@cumin1002: END (PASS) - Cookbook sre.druid.roll-restart-workers (exit_code=0) for Druid public cluster: Roll restart of Druid jvm daemons.
  • 17:48 btullis@cumin1002: END (PASS) - Cookbook sre.druid.roll-restart-workers (exit_code=0) for Druid analytics cluster: Roll restart of Druid jvm daemons.
  • 17:23 btullis@cumin1002: END (PASS) - Cookbook sre.presto.roll-restart-workers (exit_code=0) for Presto analytics cluster: Roll restart of all Presto's jvm daemons.
  • 17:02 btullis@cumin1002: START - Cookbook sre.druid.roll-restart-workers for Druid public cluster: Roll restart of Druid jvm daemons.
  • 17:00 btullis@cumin1002: START - Cookbook sre.druid.roll-restart-workers for Druid analytics cluster: Roll restart of Druid jvm daemons.
  • 16:51 btullis@cumin1002: START - Cookbook sre.presto.roll-restart-workers for Presto analytics cluster: Roll restart of all Presto's jvm daemons.
  • 16:45 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts dbstore1005.eqiad.wmnet
  • 16:45 btullis@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:45 btullis@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: dbstore1005.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1002"
  • 15:26 hnowlan: depooled jobrunner mw1460 to repurpose as k8s node
  • 15:06 btullis@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: dbstore1005.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1002"
  • 15:03 btullis@cumin1002: START - Cookbook sre.dns.netbox
  • 14:59 isaranto@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
  • 14:47 btullis@cumin1002: START - Cookbook sre.hosts.decommission for hosts dbstore1005.eqiad.wmnet
  • 14:38 Lucas_WMDE: UTC afternoon backport+config window done
  • 14:33 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for cawiki: update wgAutoConfirmAge and wgAutoConfirmCount (T354425) (duration: 11m 36s)
  • 14:28 klausman@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
  • 14:28 klausman@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
  • 14:27 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and anzx: Continuing with sync
  • 14:26 jelto@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 14:25 jelto@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 14:24 jelto@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 14:24 jelto@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 14:23 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and anzx: Backport for cawiki: update wgAutoConfirmAge and wgAutoConfirmCount (T354425) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:23 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 14:23 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 14:22 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for cawiki: update wgAutoConfirmAge and wgAutoConfirmCount (T354425)
  • 13:49 aikochou@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
  • 13:26 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-gp2003.codfw.wmnet
  • 13:19 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc-gp2003.codfw.wmnet
  • 13:19 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-gp2002.codfw.wmnet
  • 13:12 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc-gp2002.codfw.wmnet
  • 13:12 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-gp2001.codfw.wmnet
  • 13:09 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-gp1003.eqiad.wmnet
  • 13:05 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc-gp2001.codfw.wmnet
  • 13:03 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc-gp1003.eqiad.wmnet
  • 13:00 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts dbstore1003.eqiad.wmnet
  • 13:00 btullis@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:00 btullis@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: dbstore1003.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1002"
  • 12:59 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: mediawiki::memcached::gutter
  • 12:59 btullis@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: dbstore1003.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1002"
  • 12:54 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: mediawiki::memcached::gutter
  • 12:42 btullis@cumin1002: START - Cookbook sre.dns.netbox
  • 12:39 effie: enable puppet on mc* hosts - T349619
  • 12:37 btullis@cumin1002: START - Cookbook sre.hosts.decommission for hosts dbstore1003.eqiad.wmnet
  • 12:23 effie: stopping puppet on all mediawiki memcached hosts (mc*, mc-gp*), puppet 7 migration in progress - T349619
  • 12:01 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 92 hosts
  • 12:00 btullis@cumin1002: START - Cookbook sre.hosts.remove-downtime for 92 hosts
  • 11:41 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply
  • 11:38 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
  • 11:10 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on an-coord[1001-1004].eqiad.wmnet with reason: Bringing new nameservers into service
  • 11:10 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on an-coord[1001-1004].eqiad.wmnet with reason: Bringing new nameservers into service
  • 11:10 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on an-master[1001-1004].eqiad.wmnet with reason: Bringing new nameservers into service
  • 11:10 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on an-master[1001-1004].eqiad.wmnet with reason: Bringing new nameservers into service
  • 11:09 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1037.eqiad.wmnet
  • 11:08 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on 8 hosts with reason: Bringing new nameservers into service
  • 11:08 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on 8 hosts with reason: Bringing new nameservers into service
  • 11:08 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on 97 hosts with reason: Bringing new nameservers into service
  • 11:07 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on 97 hosts with reason: Bringing new nameservers into service
  • 11:03 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1037.eqiad.wmnet
  • 10:58 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-gp1002.eqiad.wmnet
  • 10:51 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc-gp1002.eqiad.wmnet
  • 10:48 moritzm: installing systemd bugfix updates from Bullseye point release
  • 10:30 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1037.eqiad.wmnet
  • 10:13 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1037.eqiad.wmnet
  • 10:08 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc-gp1002.eqiad.wmnet
  • 10:02 ladsgroup@deploy2002: Finished scap: Backport for SecurePoll: Adding updated voterlist files (T349263) (duration: 16m 04s)
  • 09:58 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc-gp1002.eqiad.wmnet
  • 09:56 ladsgroup@deploy2002: ladsgroup: Continuing with sync
  • 09:48 ladsgroup@deploy2002: ladsgroup: Backport for SecurePoll: Adding updated voterlist files (T349263) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 09:46 ladsgroup@deploy2002: Started scap: Backport for SecurePoll: Adding updated voterlist files (T349263)
  • 09:16 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:16 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:15 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:15 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:15 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:14 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:45 filippo@deploy2002: Finished deploy [performance/arc-lamp@67389a0]: (no justification provided) (duration: 00m 05s)
  • 08:45 filippo@deploy2002: Started deploy [performance/arc-lamp@67389a0]: (no justification provided)
  • 08:23 dcausse@deploy2002: Finished scap: Backport for enable page_rerender for 5th batch of wikis (T351503) (duration: 11m 40s)
  • 08:17 dcausse@deploy2002: pfischer and dcausse: Continuing with sync
  • 08:13 dcausse@deploy2002: pfischer and dcausse: Backport for enable page_rerender for 5th batch of wikis (T351503) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:12 dcausse@deploy2002: Started scap: Backport for enable page_rerender for 5th batch of wikis (T351503)
  • 04:57 andrewbogott: restarting wikitech-static, oom

2024-01-14

2024-01-12

  • 23:49 dzahn@cumin1001: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Conniecc1 out of all services on: 2213 hosts
  • 23:47 dzahn@cumin1001: START - Cookbook sre.idm.logout Logging Conniecc1 out of all services on: 2213 hosts
  • 22:52 dzahn@cumin1001: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Conniecc1 out of all services on: 2213 hosts
  • 22:51 dzahn@cumin1001: START - Cookbook sre.idm.logout Logging Conniecc1 out of all services on: 2213 hosts
  • 22:29 dzahn@cumin1001: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Conniecc1 out of all services on: 2213 hosts
  • 22:28 dzahn@cumin1001: START - Cookbook sre.idm.logout Logging Conniecc1 out of all services on: 2213 hosts
  • 18:07 mutante: aphlict1002 - systemctl start logrotate
  • 17:18 tchanders@deploy2002: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
  • 17:18 tchanders@deploy2002: helmfile [codfw] START helmfile.d/services/ipoid: apply
  • 17:17 tchanders@deploy2002: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
  • 17:16 tchanders@deploy2002: helmfile [eqiad] START helmfile.d/services/ipoid: apply
  • 17:10 tchanders@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
  • 17:09 tchanders@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
  • 16:52 cgoubert@deploy2002: helmfile [codfw] [main] DONE helmfile.d/services/mw-jobrunner : sync
  • 16:52 cgoubert@deploy2002: helmfile [codfw] [main] START helmfile.d/services/mw-jobrunner : sync
  • 16:51 cgoubert@deploy2002: helmfile [eqiad] [main] DONE helmfile.d/services/mw-jobrunner : sync
  • 16:51 cgoubert@deploy2002: helmfile [eqiad] [main] START helmfile.d/services/mw-jobrunner : sync
  • 16:20 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
  • 16:20 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
  • 16:20 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
  • 16:19 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
  • 15:46 klausman@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
  • 15:37 klausman@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
  • 15:14 klausman@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
  • 15:14 klausman@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
  • 14:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2114.codfw.wmnet with reason: Maintenance
  • 14:31 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2114.codfw.wmnet with reason: Maintenance
  • 14:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193 (T354336)', diff saved to https://phabricator.wikimedia.org/P54714 and previous config saved to /var/cache/conftool/dbconfig/20240112-140423-marostegui.json
  • 13:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P54713 and previous config saved to /var/cache/conftool/dbconfig/20240112-134916-marostegui.json
  • 13:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P54712 and previous config saved to /var/cache/conftool/dbconfig/20240112-133410-marostegui.json
  • 13:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193 (T354336)', diff saved to https://phabricator.wikimedia.org/P54711 and previous config saved to /var/cache/conftool/dbconfig/20240112-131904-marostegui.json
  • 12:59 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2193 (T354336)', diff saved to https://phabricator.wikimedia.org/P54710 and previous config saved to /var/cache/conftool/dbconfig/20240112-125944-marostegui.json
  • 12:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2193.codfw.wmnet with reason: Maintenance
  • 12:59 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2193.codfw.wmnet with reason: Maintenance
  • 12:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T354336)', diff saved to https://phabricator.wikimedia.org/P54709 and previous config saved to /var/cache/conftool/dbconfig/20240112-125921-marostegui.json
  • 12:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P54708 and previous config saved to /var/cache/conftool/dbconfig/20240112-124416-marostegui.json
  • 12:33 urbanecm: [urbanecm@mwmaint2002 ~]$ mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=dewiki --logwiki=metawiki 'Osip Knecht' 'Artquichotte39'
  • 12:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P54707 and previous config saved to /var/cache/conftool/dbconfig/20240112-122909-marostegui.json
  • 12:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T354336)', diff saved to https://phabricator.wikimedia.org/P54706 and previous config saved to /var/cache/conftool/dbconfig/20240112-121402-marostegui.json
  • 12:11 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2180 (T354336)', diff saved to https://phabricator.wikimedia.org/P54704 and previous config saved to /var/cache/conftool/dbconfig/20240112-121150-marostegui.json
  • 12:11 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2180.codfw.wmnet with reason: Maintenance
  • 12:11 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2180.codfw.wmnet with reason: Maintenance
  • 12:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316 (T354336)', diff saved to https://phabricator.wikimedia.org/P54703 and previous config saved to /var/cache/conftool/dbconfig/20240112-121127-marostegui.json
  • 12:06 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply
  • 12:06 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
  • 12:06 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply
  • 12:05 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
  • 12:03 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply
  • 12:03 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
  • 11:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316', diff saved to https://phabricator.wikimedia.org/P54701 and previous config saved to /var/cache/conftool/dbconfig/20240112-115621-marostegui.json
  • 11:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316', diff saved to https://phabricator.wikimedia.org/P54700 and previous config saved to /var/cache/conftool/dbconfig/20240112-114114-marostegui.json
  • 11:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316 (T354336)', diff saved to https://phabricator.wikimedia.org/P54699 and previous config saved to /var/cache/conftool/dbconfig/20240112-112608-marostegui.json
  • 11:20 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2171:3316 (T354336)', diff saved to https://phabricator.wikimedia.org/P54698 and previous config saved to /var/cache/conftool/dbconfig/20240112-112049-marostegui.json
  • 11:20 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2171.codfw.wmnet with reason: Maintenance
  • 11:20 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2171.codfw.wmnet with reason: Maintenance
  • 11:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169:3316 (T354336)', diff saved to https://phabricator.wikimedia.org/P54697 and previous config saved to /var/cache/conftool/dbconfig/20240112-112027-marostegui.json
  • 11:10 jelto@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab to new version
  • 11:08 isaranto@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
  • 11:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169:3316', diff saved to https://phabricator.wikimedia.org/P54696 and previous config saved to /var/cache/conftool/dbconfig/20240112-110521-marostegui.json
  • 11:04 isaranto@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
  • 10:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169:3316', diff saved to https://phabricator.wikimedia.org/P54695 and previous config saved to /var/cache/conftool/dbconfig/20240112-105014-marostegui.json
  • 10:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169:3316 (T354336)', diff saved to https://phabricator.wikimedia.org/P54694 and previous config saved to /var/cache/conftool/dbconfig/20240112-103508-marostegui.json
  • 10:32 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2169:3316 (T354336)', diff saved to https://phabricator.wikimedia.org/P54693 and previous config saved to /var/cache/conftool/dbconfig/20240112-103250-marostegui.json
  • 10:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2169.codfw.wmnet with reason: Maintenance
  • 10:32 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2169.codfw.wmnet with reason: Maintenance
  • 10:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T354336)', diff saved to https://phabricator.wikimedia.org/P54692 and previous config saved to /var/cache/conftool/dbconfig/20240112-103227-marostegui.json
  • 10:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P54691 and previous config saved to /var/cache/conftool/dbconfig/20240112-101721-marostegui.json
  • 10:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P54690 and previous config saved to /var/cache/conftool/dbconfig/20240112-100214-marostegui.json
  • 09:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T354336)', diff saved to https://phabricator.wikimedia.org/P54689 and previous config saved to /var/cache/conftool/dbconfig/20240112-094708-marostegui.json
  • 09:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2158 (T354336)', diff saved to https://phabricator.wikimedia.org/P54688 and previous config saved to /var/cache/conftool/dbconfig/20240112-094451-marostegui.json
  • 09:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 09:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 09:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2158.codfw.wmnet with reason: Maintenance
  • 09:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2158.codfw.wmnet with reason: Maintenance
  • 09:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T354336)', diff saved to https://phabricator.wikimedia.org/P54687 and previous config saved to /var/cache/conftool/dbconfig/20240112-094413-marostegui.json
  • 09:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P54686 and previous config saved to /var/cache/conftool/dbconfig/20240112-092907-marostegui.json
  • 09:25 jelto@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab to new version
  • 09:25 jelto@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab Replica to new version
  • 09:17 jelto@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab Replica to new version
  • 09:16 jelto@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab Replica to new version
  • 09:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P54685 and previous config saved to /var/cache/conftool/dbconfig/20240112-091400-marostegui.json
  • 09:09 jelto@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab Replica to new version
  • 08:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T354336)', diff saved to https://phabricator.wikimedia.org/P54684 and previous config saved to /var/cache/conftool/dbconfig/20240112-085854-marostegui.json
  • 08:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2151 (T354336)', diff saved to https://phabricator.wikimedia.org/P54683 and previous config saved to /var/cache/conftool/dbconfig/20240112-085637-marostegui.json
  • 08:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2151.codfw.wmnet with reason: Maintenance
  • 08:56 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2151.codfw.wmnet with reason: Maintenance
  • 08:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2129 (T354336)', diff saved to https://phabricator.wikimedia.org/P54682 and previous config saved to /var/cache/conftool/dbconfig/20240112-085614-marostegui.json
  • 08:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2129', diff saved to https://phabricator.wikimedia.org/P54681 and previous config saved to /var/cache/conftool/dbconfig/20240112-084108-marostegui.json
  • 08:40 godog: upload and finish upgrade of prometheus 2.48 on all sites - T354399
  • 08:38 marostegui@cumin1002: dbctl commit (dc=all): 'db1168 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54680 and previous config saved to /var/cache/conftool/dbconfig/20240112-083837-root.json
  • 08:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2129', diff saved to https://phabricator.wikimedia.org/P54679 and previous config saved to /var/cache/conftool/dbconfig/20240112-082601-marostegui.json
  • 08:23 marostegui@cumin1002: dbctl commit (dc=all): 'db1168 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54678 and previous config saved to /var/cache/conftool/dbconfig/20240112-082332-root.json
  • 08:20 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 3605
  • 08:19 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 3605
  • 08:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2129 (T354336)', diff saved to https://phabricator.wikimedia.org/P54677 and previous config saved to /var/cache/conftool/dbconfig/20240112-081055-marostegui.json
  • 08:08 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2129 (T354336)', diff saved to https://phabricator.wikimedia.org/P54676 and previous config saved to /var/cache/conftool/dbconfig/20240112-080837-marostegui.json
  • 08:08 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2129.codfw.wmnet with reason: Maintenance
  • 08:08 marostegui@cumin1002: dbctl commit (dc=all): 'db1168 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54675 and previous config saved to /var/cache/conftool/dbconfig/20240112-080827-root.json
  • 08:08 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2129.codfw.wmnet with reason: Maintenance
  • 08:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124 (T354336)', diff saved to https://phabricator.wikimedia.org/P54674 and previous config saved to /var/cache/conftool/dbconfig/20240112-080815-marostegui.json
  • 07:53 marostegui@cumin1002: dbctl commit (dc=all): 'db1168 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54673 and previous config saved to /var/cache/conftool/dbconfig/20240112-075322-root.json
  • 07:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124', diff saved to https://phabricator.wikimedia.org/P54672 and previous config saved to /var/cache/conftool/dbconfig/20240112-075309-marostegui.json
  • 07:38 marostegui@cumin1002: dbctl commit (dc=all): 'db1168 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54671 and previous config saved to /var/cache/conftool/dbconfig/20240112-073817-root.json
  • 07:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124', diff saved to https://phabricator.wikimedia.org/P54670 and previous config saved to /var/cache/conftool/dbconfig/20240112-073802-marostegui.json
  • 07:23 marostegui@cumin1002: dbctl commit (dc=all): 'db1168 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54669 and previous config saved to /var/cache/conftool/dbconfig/20240112-072312-root.json
  • 07:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124 (T354336)', diff saved to https://phabricator.wikimedia.org/P54668 and previous config saved to /var/cache/conftool/dbconfig/20240112-072255-marostegui.json
  • 07:20 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2124 (T354336)', diff saved to https://phabricator.wikimedia.org/P54667 and previous config saved to /var/cache/conftool/dbconfig/20240112-072038-marostegui.json
  • 07:20 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2124.codfw.wmnet with reason: Maintenance
  • 07:20 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2124.codfw.wmnet with reason: Maintenance
  • 07:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2117 (T354336)', diff saved to https://phabricator.wikimedia.org/P54666 and previous config saved to /var/cache/conftool/dbconfig/20240112-072015-marostegui.json
  • 07:08 marostegui@cumin1002: dbctl commit (dc=all): 'db1168 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54665 and previous config saved to /var/cache/conftool/dbconfig/20240112-070807-root.json
  • 07:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2117', diff saved to https://phabricator.wikimedia.org/P54664 and previous config saved to /var/cache/conftool/dbconfig/20240112-070508-marostegui.json
  • 06:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1168.eqiad.wmnet with OS bookworm
  • 06:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2117', diff saved to https://phabricator.wikimedia.org/P54663 and previous config saved to /var/cache/conftool/dbconfig/20240112-065002-marostegui.json
  • 06:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1168.eqiad.wmnet with reason: host reimage
  • 06:35 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1168.eqiad.wmnet with reason: host reimage
  • 06:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2117 (T354336)', diff saved to https://phabricator.wikimedia.org/P54662 and previous config saved to /var/cache/conftool/dbconfig/20240112-063456-marostegui.json
  • 06:32 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2117 (T354336)', diff saved to https://phabricator.wikimedia.org/P54661 and previous config saved to /var/cache/conftool/dbconfig/20240112-063239-marostegui.json
  • 06:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2117.codfw.wmnet with reason: Maintenance
  • 06:32 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2117.codfw.wmnet with reason: Maintenance
  • 06:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2097.codfw.wmnet with reason: Maintenance
  • 06:31 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2097.codfw.wmnet with reason: Maintenance
  • 06:23 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1168.eqiad.wmnet with OS bookworm
  • 06:21 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1168 T354506', diff saved to https://phabricator.wikimedia.org/P54660 and previous config saved to /var/cache/conftool/dbconfig/20240112-062137-marostegui.json
  • 06:12 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1173.eqiad.wmnet with reason: Maintenance
  • 06:11 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1173.eqiad.wmnet with reason: Maintenance
  • 04:12 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 04:12 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 04:12 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 04:11 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 04:11 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 04:11 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 00:59 mutante: LDAP - added myself to gerritadmin group

2024-01-11

  • 21:36 jan_drewniak: https://phabricator.wikimedia.org/T349337#9454773 running maintenance script to delete unnecessary user preferences.
  • 21:26 jdrewniak@deploy2002: Finished scap: Backport for InitialiseSettings.php: disallow obsolete HTML in signatures (enwiki) (T354013), InitialiseSettings.php: Allow thanking bots (T341388) (duration: 13m 43s)
  • 21:20 jdrewniak@deploy2002: jdrewniak and houseblaster: Continuing with sync
  • 21:14 jdrewniak@deploy2002: jdrewniak and houseblaster: Backport for InitialiseSettings.php: disallow obsolete HTML in signatures (enwiki) (T354013), InitialiseSettings.php: Allow thanking bots (T341388) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:12 jdrewniak@deploy2002: Started scap: Backport for InitialiseSettings.php: disallow obsolete HTML in signatures (enwiki) (T354013), InitialiseSettings.php: Allow thanking bots (T341388)
  • 20:50 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
  • 20:50 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
  • 20:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231 (T354336)', diff saved to https://phabricator.wikimedia.org/P54657 and previous config saved to /var/cache/conftool/dbconfig/20240111-205021-marostegui.json
  • 20:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P54656 and previous config saved to /var/cache/conftool/dbconfig/20240111-203514-marostegui.json
  • 20:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P54655 and previous config saved to /var/cache/conftool/dbconfig/20240111-202008-marostegui.json
  • 20:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231 (T354336)', diff saved to https://phabricator.wikimedia.org/P54654 and previous config saved to /var/cache/conftool/dbconfig/20240111-200502-marostegui.json
  • 20:03 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1231 (T354336)', diff saved to https://phabricator.wikimedia.org/P54653 and previous config saved to /var/cache/conftool/dbconfig/20240111-200253-marostegui.json
  • 20:03 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1231.eqiad.wmnet with reason: Maintenance
  • 20:02 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1231.eqiad.wmnet with reason: Maintenance
  • 20:02 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1225.eqiad.wmnet with reason: Maintenance
  • 20:02 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1225.eqiad.wmnet with reason: Maintenance
  • 20:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224 (T354336)', diff saved to https://phabricator.wikimedia.org/P54652 and previous config saved to /var/cache/conftool/dbconfig/20240111-200209-marostegui.json
  • 20:00 htriedman@deploy2002: Finished deploy [airflow-dags/platform_eng@07f5320]: (no justification provided) (duration: 00m 27s)
  • 20:00 htriedman@deploy2002: Started deploy [airflow-dags/platform_eng@07f5320]: (no justification provided)
  • 19:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P54651 and previous config saved to /var/cache/conftool/dbconfig/20240111-194703-marostegui.json
  • 19:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P54649 and previous config saved to /var/cache/conftool/dbconfig/20240111-193156-marostegui.json
  • 19:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224 (T354336)', diff saved to https://phabricator.wikimedia.org/P54647 and previous config saved to /var/cache/conftool/dbconfig/20240111-191650-marostegui.json
  • 19:14 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1224 (T354336)', diff saved to https://phabricator.wikimedia.org/P54646 and previous config saved to /var/cache/conftool/dbconfig/20240111-191440-marostegui.json
  • 19:14 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1224.eqiad.wmnet with reason: Maintenance
  • 19:14 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1224.eqiad.wmnet with reason: Maintenance
  • 19:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213:3316 (T354336)', diff saved to https://phabricator.wikimedia.org/P54645 and previous config saved to /var/cache/conftool/dbconfig/20240111-191418-marostegui.json
  • 19:11 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: group2 wikis to 1.42.0-wmf.13 refs T350089
  • 19:06 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:05 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213:3316', diff saved to https://phabricator.wikimedia.org/P54644 and previous config saved to /var/cache/conftool/dbconfig/20240111-185912-marostegui.json
  • 18:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213:3316', diff saved to https://phabricator.wikimedia.org/P54643 and previous config saved to /var/cache/conftool/dbconfig/20240111-184405-marostegui.json
  • 18:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213:3316 (T354336)', diff saved to https://phabricator.wikimedia.org/P54641 and previous config saved to /var/cache/conftool/dbconfig/20240111-182859-marostegui.json
  • 18:27 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1213:3316 (T354336)', diff saved to https://phabricator.wikimedia.org/P54640 and previous config saved to /var/cache/conftool/dbconfig/20240111-182745-marostegui.json
  • 18:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1213.eqiad.wmnet with reason: Maintenance
  • 18:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1213.eqiad.wmnet with reason: Maintenance
  • 18:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201 (T354336)', diff saved to https://phabricator.wikimedia.org/P54639 and previous config saved to /var/cache/conftool/dbconfig/20240111-182723-marostegui.json
  • 18:27 thcipriani@deploy2002: Finished deploy [gerrit/gerrit@376b3e5]: Remove devsat survey banner in 3.6 (gerrit primary: gerrit.wikimedia.org) (duration: 00m 07s)
  • 18:27 thcipriani@deploy2002: Started deploy [gerrit/gerrit@376b3e5]: Remove devsat survey banner in 3.6 (gerrit primary: gerrit.wikimedia.org)
  • 18:25 thcipriani@deploy2002: Finished deploy [gerrit/gerrit@376b3e5]: Remove devsat survey banner in 3.6 (gerrit2002 only) (duration: 00m 05s)
  • 18:25 thcipriani@deploy2002: Started deploy [gerrit/gerrit@376b3e5]: Remove devsat survey banner in 3.6 (gerrit2002 only)
  • 18:23 thcipriani: deploying gerrit to remove devsat survey (no restart needed)
  • 18:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201', diff saved to https://phabricator.wikimedia.org/P54638 and previous config saved to /var/cache/conftool/dbconfig/20240111-181217-marostegui.json
  • 17:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201', diff saved to https://phabricator.wikimedia.org/P54637 and previous config saved to /var/cache/conftool/dbconfig/20240111-175710-marostegui.json
  • 17:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201 (T354336)', diff saved to https://phabricator.wikimedia.org/P54636 and previous config saved to /var/cache/conftool/dbconfig/20240111-174204-marostegui.json
  • 17:39 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1201 (T354336)', diff saved to https://phabricator.wikimedia.org/P54635 and previous config saved to /var/cache/conftool/dbconfig/20240111-173955-marostegui.json
  • 17:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1201.eqiad.wmnet with reason: Maintenance
  • 17:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1201.eqiad.wmnet with reason: Maintenance
  • 17:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187 (T354336)', diff saved to https://phabricator.wikimedia.org/P54634 and previous config saved to /var/cache/conftool/dbconfig/20240111-173933-marostegui.json
  • 17:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187', diff saved to https://phabricator.wikimedia.org/P54633 and previous config saved to /var/cache/conftool/dbconfig/20240111-172427-marostegui.json
  • 17:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187', diff saved to https://phabricator.wikimedia.org/P54632 and previous config saved to /var/cache/conftool/dbconfig/20240111-170920-marostegui.json
  • 16:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187 (T354336)', diff saved to https://phabricator.wikimedia.org/P54631 and previous config saved to /var/cache/conftool/dbconfig/20240111-165414-marostegui.json
  • 16:53 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1187 (T354336)', diff saved to https://phabricator.wikimedia.org/P54630 and previous config saved to /var/cache/conftool/dbconfig/20240111-165305-marostegui.json
  • 16:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1187.eqiad.wmnet with reason: Maintenance
  • 16:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1187.eqiad.wmnet with reason: Maintenance
  • 16:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T354336)', diff saved to https://phabricator.wikimedia.org/P54629 and previous config saved to /var/cache/conftool/dbconfig/20240111-165244-marostegui.json
  • 16:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P54628 and previous config saved to /var/cache/conftool/dbconfig/20240111-163738-marostegui.json
  • 16:23 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:23 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P54626 and previous config saved to /var/cache/conftool/dbconfig/20240111-162231-marostegui.json
  • 16:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T354336)', diff saved to https://phabricator.wikimedia.org/P54625 and previous config saved to /var/cache/conftool/dbconfig/20240111-160725-marostegui.json
  • 16:07 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: cache::upload
  • 16:05 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1180 (T354336)', diff saved to https://phabricator.wikimedia.org/P54624 and previous config saved to /var/cache/conftool/dbconfig/20240111-160516-marostegui.json
  • 16:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1180.eqiad.wmnet with reason: Maintenance
  • 16:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1180.eqiad.wmnet with reason: Maintenance
  • 16:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T354336)', diff saved to https://phabricator.wikimedia.org/P54623 and previous config saved to /var/cache/conftool/dbconfig/20240111-160454-marostegui.json
  • 15:59 sukhe: restart pybal on lvs4010
  • 15:58 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on A:swift-fe
  • 15:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P54622 and previous config saved to /var/cache/conftool/dbconfig/20240111-154947-marostegui.json
  • 15:47 mvernon@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on A:swift-fe
  • 15:41 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: cache::upload
  • 15:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P54621 and previous config saved to /var/cache/conftool/dbconfig/20240111-153441-marostegui.json
  • 15:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T354336)', diff saved to https://phabricator.wikimedia.org/P54620 and previous config saved to /var/cache/conftool/dbconfig/20240111-151934-marostegui.json
  • 15:17 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1168 (T354336)', diff saved to https://phabricator.wikimedia.org/P54619 and previous config saved to /var/cache/conftool/dbconfig/20240111-151724-marostegui.json
  • 15:17 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1168.eqiad.wmnet with reason: Maintenance
  • 15:17 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1168.eqiad.wmnet with reason: Maintenance
  • 15:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T354336)', diff saved to https://phabricator.wikimedia.org/P54618 and previous config saved to /var/cache/conftool/dbconfig/20240111-151702-marostegui.json
  • 15:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P54617 and previous config saved to /var/cache/conftool/dbconfig/20240111-150156-marostegui.json
  • 14:51 reedy@deploy2002: Synchronized wmf-config/: T325147 (duration: 06m 43s)
  • 14:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P54616 and previous config saved to /var/cache/conftool/dbconfig/20240111-144649-marostegui.json
  • 14:36 reedy@deploy2002: Synchronized wmf-config/: T344398 (duration: 07m 25s)
  • 14:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T354336)', diff saved to https://phabricator.wikimedia.org/P54615 and previous config saved to /var/cache/conftool/dbconfig/20240111-143143-marostegui.json
  • 14:30 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1165 (T354336)', diff saved to https://phabricator.wikimedia.org/P54614 and previous config saved to /var/cache/conftool/dbconfig/20240111-143034-marostegui.json
  • 14:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 14:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 14:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1165.eqiad.wmnet with reason: Maintenance
  • 14:29 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1165.eqiad.wmnet with reason: Maintenance
  • 14:26 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
  • 14:25 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
  • 14:25 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
  • 14:25 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
  • 14:24 kamila@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
  • 14:24 kamila@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
  • 14:21 reedy@deploy2002: Synchronized wmf-config/InitialiseSettings.php: T205347 (duration: 07m 41s)
  • 14:10 marostegui@cumin1002: dbctl commit (dc=all): 'db2124 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54613 and previous config saved to /var/cache/conftool/dbconfig/20240111-141058-root.json
  • 13:55 marostegui@cumin1002: dbctl commit (dc=all): 'db2124 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54612 and previous config saved to /var/cache/conftool/dbconfig/20240111-135553-root.json
  • 13:49 hashar@deploy2002: Finished deploy [gerrit/gerrit@af34477]: wm-zuul-status: add SCHEDULED for pending check run - T348959 (duration: 00m 07s)
  • 13:49 hashar@deploy2002: Started deploy [gerrit/gerrit@af34477]: wm-zuul-status: add SCHEDULED for pending check run - T348959
  • 13:41 moritzm: installing xerces-c security updates
  • 13:40 marostegui@cumin1002: dbctl commit (dc=all): 'db2124 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54611 and previous config saved to /var/cache/conftool/dbconfig/20240111-134048-root.json
  • 13:29 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:29 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:25 marostegui@cumin1002: dbctl commit (dc=all): 'db2124 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54610 and previous config saved to /var/cache/conftool/dbconfig/20240111-132543-root.json
  • 13:10 marostegui@cumin1002: dbctl commit (dc=all): 'db2124 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54609 and previous config saved to /var/cache/conftool/dbconfig/20240111-131038-root.json
  • 12:55 marostegui@cumin1002: dbctl commit (dc=all): 'db2124 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54608 and previous config saved to /var/cache/conftool/dbconfig/20240111-125533-root.json
  • 12:47 hashar: Restarting Gerrit to apply config change https://gerrit.wikimedia.org/r/c/operations/puppet/+/989735/ # T206049
  • 12:40 marostegui@cumin1002: dbctl commit (dc=all): 'db2124 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54607 and previous config saved to /var/cache/conftool/dbconfig/20240111-124028-root.json
  • 12:33 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2124.codfw.wmnet with OS bookworm
  • 12:20 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:20 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:11 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2124.codfw.wmnet with reason: host reimage
  • 12:08 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2124.codfw.wmnet with reason: host reimage
  • 12:00 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
  • 12:00 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
  • 11:59 moritzm: installing Python 2.7 security updates on Bullseye
  • 11:50 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2124.codfw.wmnet with OS bookworm
  • 11:49 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2124 T354506', diff saved to https://phabricator.wikimedia.org/P54606 and previous config saved to /var/cache/conftool/dbconfig/20240111-114930-marostegui.json
  • 11:19 marostegui@cumin1002: dbctl commit (dc=all): 'db1201 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54605 and previous config saved to /var/cache/conftool/dbconfig/20240111-111958-root.json
  • 11:04 marostegui@cumin1002: dbctl commit (dc=all): 'db1201 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54604 and previous config saved to /var/cache/conftool/dbconfig/20240111-110453-root.json
  • 10:54 moritzm: installing Linux 5.10.205 updates on Bullseye hosts
  • 10:49 marostegui@cumin1002: dbctl commit (dc=all): 'db1201 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54603 and previous config saved to /var/cache/conftool/dbconfig/20240111-104948-root.json
  • 10:34 marostegui@cumin1002: dbctl commit (dc=all): 'db1201 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54602 and previous config saved to /var/cache/conftool/dbconfig/20240111-103443-root.json
  • 10:31 moritzm: installing exim4 security updates
  • 10:31 kharlan@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
  • 10:30 kharlan@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
  • 10:28 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: druid::public::worker
  • 10:26 kharlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
  • 10:26 kharlan@deploy2002: helmfile [codfw] START helmfile.d/services/ipoid: apply
  • 10:19 marostegui@cumin1002: dbctl commit (dc=all): 'db1201 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54601 and previous config saved to /var/cache/conftool/dbconfig/20240111-101938-root.json
  • 10:13 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: druid::public::worker
  • 10:12 kharlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
  • 10:12 kharlan@deploy2002: helmfile [codfw] START helmfile.d/services/ipoid: apply
  • 10:04 marostegui@cumin1002: dbctl commit (dc=all): 'db1201 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54600 and previous config saved to /var/cache/conftool/dbconfig/20240111-100433-root.json
  • 10:04 sfaci@deploy2002: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
  • 10:03 sfaci@deploy2002: helmfile [staging] START helmfile.d/services/edit-analytics: apply
  • 10:03 kharlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
  • 10:00 kharlan@deploy2002: helmfile [eqiad] START helmfile.d/services/ipoid: apply
  • 10:00 kharlan@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
  • 09:58 kharlan@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
  • 09:53 kharlan@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
  • 09:49 marostegui@cumin1002: dbctl commit (dc=all): 'db1201 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54599 and previous config saved to /var/cache/conftool/dbconfig/20240111-094928-root.json
  • 09:39 hashar: Gerrit back up and operational, now running version 3.6.8
  • 09:33 hashar: Gerrit restarted and it's reindexing all changes T309870
  • 09:23 hashar@deploy2002: Finished deploy [gerrit/gerrit@e099b0b]: Gerrit to version 3.6.8 # T309870 (duration: 00m 07s)
  • 09:23 hashar@deploy2002: Started deploy [gerrit/gerrit@e099b0b]: Gerrit to version 3.6.8 # T309870
  • 09:22 hashar@deploy2002: Finished deploy [gerrit/gerrit@e099b0b]: Gerrit to version 3.6.8 # T309870 (duration: 00m 27s)
  • 09:21 hashar@deploy2002: Started deploy [gerrit/gerrit@e099b0b]: Gerrit to version 3.6.8 # T309870
  • 09:21 hashar: Stopping Gerrit
  • 09:10 hashar: gerrit: `ssh -p 29418 gerrit.wikimedia.org gerrit copy-approvals` # T309870
  • 09:06 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1201.eqiad.wmnet with OS bookworm
  • 08:45 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1201.eqiad.wmnet with reason: host reimage
  • 08:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1201.eqiad.wmnet with reason: host reimage
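
The staged "(re)pooling @ N%" commits logged above (for example db1201 and db2124) can be pulled out of a day's entries with a small script. The following is a minimal sketch, not part of any Wikimedia tooling: the function names `parse_entry` and `repool_steps` are hypothetical, and the regular expressions assume the "HH:MM actor: message" layout seen in these entries. Times are compared as strings, so it only makes sense for entries from a single day.

```python
import re
from collections import defaultdict

# One log bullet: optional "•" marker, then "HH:MM actor: message".
# This layout is an assumption read off the entries in this archive.
ENTRY_RE = re.compile(
    r"^\s*(?:•\s*)?(?P<time>\d{2}:\d{2}) (?P<actor>[^:\s]+): (?P<message>.*)$"
)

# Staged repool commit message, e.g. "'db1201 (re)pooling @ 25%: ...".
REPOOL_RE = re.compile(r"'(?P<host>\S+) \(re\)pooling @ (?P<pct>\d+)%")


def parse_entry(line):
    """Split one log line into time, actor and message; None if it does not match."""
    match = ENTRY_RE.match(line)
    return match.groupdict() if match else None


def repool_steps(lines):
    """Map each host to its (time, percentage) repool steps in chronological order."""
    steps = defaultdict(list)
    for line in lines:
        entry = parse_entry(line)
        if entry is None:
            continue  # date headings, blank lines, anything not shaped like an entry
        match = REPOOL_RE.search(entry["message"])
        if match:
            steps[match.group("host")].append(
                (entry["time"], int(match.group("pct")))
            )
    # The log is newest-first, so sort each host's steps by time.
    return {host: sorted(found) for host, found in steps.items()}
```

Feeding it the 2024-01-11 entries would list each host's percentage ramp in order, which makes the interval between steps easy to check.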

2024-01-10

  • 22:29 herron@cumin1002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-logging-eqiad
  • 22:05 herron@cumin1002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-logging-eqiad
  • 21:54 herron@cumin1002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-logging-codfw
  • 21:36 Dreamy_Jazz: UTC late deploys done
  • 21:33 dreamyjazz@deploy2002: Finished scap: Backport for Add comment to clarify which rate limits apply to temporary users (T331576) (duration: 08m 05s)
  • 21:28 herron@cumin1002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-logging-codfw
  • 21:27 dreamyjazz@deploy2002: dreamyjazz and tchanders: Continuing with sync
  • 21:27 dreamyjazz@deploy2002: dreamyjazz and tchanders: Backport for Add comment to clarify which rate limits apply to temporary users (T331576) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:25 dreamyjazz@deploy2002: Started scap: Backport for Add comment to clarify which rate limits apply to temporary users (T331576)
  • 21:19 taavi@deploy2002: Finished scap: Backport for Disable max width for index namespace (T352162) (duration: 14m 19s)
  • 21:12 taavi@deploy2002: toyofuku and taavi: Continuing with sync
  • 21:08 taavi@deploy2002: toyofuku and taavi: Backport for Disable max width for index namespace (T352162) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:05 taavi@deploy2002: Started scap: Backport for Disable max width for index namespace (T352162)
  • 20:22 sukhe: enable puppet on lvs2013: T352758
  • 19:29 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:29 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove old records for mr1-codfw core links - cmooney@cumin1002"
  • 19:28 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove old records for mr1-codfw core links - cmooney@cumin1002"
  • 19:26 jhuneidi@deploy2002: Synchronized php: group1 wikis to 1.42.0-wmf.13 refs T350089 (duration: 07m 58s)
  • 19:24 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 19:18 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: group1 wikis to 1.42.0-wmf.13 refs T350089
  • 19:00 topranks: disabling OSPF connection from mr1-codfw to codfw core routers T348164
  • 18:40 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply
  • 18:38 filippo@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on prometheus2006.codfw.wmnet with reason: memory upgrade
  • 18:37 filippo@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on prometheus2006.codfw.wmnet with reason: memory upgrade
  • 18:37 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
  • 18:37 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply
  • 18:36 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
  • 18:35 filippo@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for prometheus2005.codfw.wmnet
  • 18:35 filippo@cumin1002: START - Cookbook sre.hosts.remove-downtime for prometheus2005.codfw.wmnet
  • 18:24 sukhe: stop pybal on lvs2013: T352758
  • 17:59 filippo@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on prometheus2005.codfw.wmnet with reason: memory upgrade
  • 17:58 filippo@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on prometheus2005.codfw.wmnet with reason: memory upgrade
  • 17:54 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1377.eqiad.wmnet with OS bullseye
  • 17:47 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:46 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:44 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:44 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:40 sukhe@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host lvs2014.codfw.wmnet
  • 17:34 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1377.eqiad.wmnet with reason: host reimage
  • 17:31 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1377.eqiad.wmnet with reason: host reimage
  • 17:28 sukhe@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2014.codfw.wmnet
  • 17:27 sukhe: enable puppet on lvs2014: T352758
  • 17:16 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1377.eqiad.wmnet with OS bullseye
  • 17:15 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1378.eqiad.wmnet with OS bullseye
  • 17:14 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:14 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update reverse dns for sandbox1-a-codfw irb.2201 gw - cmooney@cumin1002"
  • 17:14 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update reverse dns for sandbox1-a-codfw irb.2201 gw - cmooney@cumin1002"
  • 17:09 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 16:55 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1378.eqiad.wmnet with reason: host reimage
  • 16:52 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1378.eqiad.wmnet with reason: host reimage
  • 16:37 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1378.eqiad.wmnet with OS bullseye
  • 16:36 godog: upgrade prometheus on prometheus2006 - T354399
  • 16:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply
  • 16:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
  • 16:31 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply
  • 16:31 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
  • 16:30 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply
  • 16:29 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
  • 16:25 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mw[1379-1383].eqiad.wmnet with reason: testing reboot
  • 16:25 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on mw[1379-1383].eqiad.wmnet with reason: testing reboot
  • 16:22 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1379.eqiad.wmnet with OS bullseye
  • 16:20 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
  • 16:02 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1379.eqiad.wmnet with reason: host reimage
  • 16:00 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1383.eqiad.wmnet with OS bullseye
  • 15:59 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1381.eqiad.wmnet with OS bullseye
  • 15:57 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1382.eqiad.wmnet with OS bullseye
  • 15:57 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1379.eqiad.wmnet with reason: host reimage
  • 15:41 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: logging::opensearch::data
  • 15:41 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1379.eqiad.wmnet with OS bullseye
  • 15:40 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1383.eqiad.wmnet with reason: host reimage
  • 15:37 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1381.eqiad.wmnet with reason: host reimage
  • 15:37 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw1382.eqiad.wmnet with reason: host reimage
  • 15:35 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1383.eqiad.wmnet with reason: host reimage
  • 15:35 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1382.eqiad.wmnet with reason: host reimage
  • 15:34 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1381.eqiad.wmnet with reason: host reimage
  • 15:24 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: logging::opensearch::data
  • 15:24 klausman@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-staging2001.codfw.wmnet
  • 15:22 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase2013.codfw.wmnet with reason: Decommissioning — T352469
  • 15:21 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase2013.codfw.wmnet with reason: Decommissioning — T352469
  • 15:21 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1383.eqiad.wmnet with OS bullseye
  • 15:20 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1382.eqiad.wmnet with OS bullseye
  • 15:20 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: logging::opensearch::collector
  • 15:19 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1381.eqiad.wmnet with OS bullseye
  • 15:17 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1380.eqiad.wmnet with OS bullseye
  • 15:14 klausman@cumin1001: START - Cookbook sre.hosts.reboot-single for host ml-staging2001.codfw.wmnet
  • 15:13 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on an-master[1003-1004].eqiad.wmnet with reason: Bringing new nameservers into service
  • 15:13 klausman@cumin1001: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ml-staging2001.codfw.wmnet
  • 15:12 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on an-master[1003-1004].eqiad.wmnet with reason: Bringing new nameservers into service
  • 15:07 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts dbproxy[1018-1019].eqiad.wmnet
  • 15:06 taavi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:06 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: dbproxy[1018-1019].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - taavi@cumin1002"
  • 15:04 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on lvs2014.codfw.wmnet with reason: T352758
  • 15:04 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 3:00:00 on lvs2014.codfw.wmnet with reason: T352758
  • 15:03 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: dbproxy[1018-1019].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - taavi@cumin1002"
  • 15:01 klausman@cumin1001: START - Cookbook sre.hosts.reboot-single for host ml-staging2001.codfw.wmnet
  • 15:01 sukhe: disable puppet and stop pybal on lvs2014: T352758
  • 15:00 taavi@cumin1002: START - Cookbook sre.dns.netbox
  • 14:57 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1380.eqiad.wmnet with reason: host reimage
  • 14:55 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: logging::opensearch::collector
  • 14:54 topranks: adding vlans to ssw1-a8-codfw to trunk to lvs2014 T352758
  • 14:52 taavi@cumin1002: START - Cookbook sre.hosts.decommission for hosts dbproxy[1018-1019].eqiad.wmnet
  • 14:52 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1380.eqiad.wmnet with reason: host reimage
  • 14:44 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: lvs::balancer
  • 14:39 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mw1349.eqiad.wmnet with reason: Trying to reproduce wdat_wdt watchdog problem
  • 14:39 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on mw1349.eqiad.wmnet with reason: Trying to reproduce wdat_wdt watchdog problem
  • 14:38 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1380.eqiad.wmnet with OS bullseye
  • 14:27 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: lvs::balancer
  • 14:27 kharlan@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
  • 14:27 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mw1378.eqiad.wmnet with reason: Trying to reproduce wdat_wdt watchdog problem
  • 14:26 kharlan@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
  • 14:26 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on mw1378.eqiad.wmnet with reason: Trying to reproduce wdat_wdt watchdog problem
  • 14:25 kharlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
  • 14:24 kharlan@deploy2002: helmfile [eqiad] START helmfile.d/services/ipoid: apply
  • 14:22 kharlan@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
  • 14:21 moritzm: installing lapack bugfix updates
  • 14:21 kharlan@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
  • 14:04 moritzm: installing openblas bugfix updates
  • 14:03 hashar: Switching operations-puppet-tests-buster-docker Jenkins job from tox v3 to tox v4 | T345152
  • 13:56 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 13:56 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
  • 13:54 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 13:54 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
  • 13:15 godog: test prometheus 2.48.1 on prometheus1005 - T354399
  • 12:48 stevemunene@cumin1002: END (FAIL) - Cookbook sre.hadoop.roll-restart-workers (exit_code=99) restart workers for Hadoop test cluster: Roll restart of jvm daemons for openjdk upgrade.
  • 12:47 stevemunene@cumin1002: START - Cookbook sre.hadoop.roll-restart-workers restart workers for Hadoop test cluster: Roll restart of jvm daemons for openjdk upgrade.
  • 12:39 stevemunene@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts druid1006.eqiad.wmnet
  • 12:39 stevemunene@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:39 stevemunene@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: druid1006.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - stevemunene@cumin1002"
  • 12:37 stevemunene@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: druid1006.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - stevemunene@cumin1002"
  • 12:37 hnowlan@deploy2002: helmfile [codfw] [main] DONE helmfile.d/services/mw-jobrunner : sync
  • 12:37 hnowlan@deploy2002: helmfile [codfw] [main] START helmfile.d/services/mw-jobrunner : sync
  • 12:37 hnowlan@deploy2002: helmfile [eqiad] [main] DONE helmfile.d/services/mw-jobrunner : sync
  • 12:37 hnowlan@deploy2002: helmfile [eqiad] [main] START helmfile.d/services/mw-jobrunner : sync
  • 12:35 stevemunene@cumin1002: START - Cookbook sre.dns.netbox
  • 12:22 stevemunene@cumin1002: START - Cookbook sre.hosts.decommission for hosts druid1006.eqiad.wmnet
  • 12:21 stevemunene@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts druid1005.eqiad.wmnet
  • 12:21 stevemunene@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:21 stevemunene@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: druid1005.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - stevemunene@cumin1002"
  • 12:20 stevemunene@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: druid1005.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - stevemunene@cumin1002"
  • 12:18 stevemunene@cumin1002: START - Cookbook sre.dns.netbox
  • 12:05 stevemunene@cumin1002: START - Cookbook sre.hosts.decommission for hosts druid1005.eqiad.wmnet
  • 11:56 stevemunene@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts druid1004.eqiad.wmnet
  • 11:56 stevemunene@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:56 stevemunene@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: druid1004.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - stevemunene@cumin1002"
  • 11:54 stevemunene@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: druid1004.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - stevemunene@cumin1002"
  • 11:51 stevemunene@cumin1002: START - Cookbook sre.dns.netbox
  • 11:47 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 11:46 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 11:46 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 11:46 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 11:46 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 11:46 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 11:43 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 11:43 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 11:43 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 11:41 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 11:41 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 11:41 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 11:39 stevemunene@cumin1002: START - Cookbook sre.hosts.decommission for hosts druid1004.eqiad.wmnet
  • 11:37 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 11:37 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 11:36 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 11:36 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 11:03 moritzm: installing PHP 7.3 security updates
  • 10:46 moritzm: installing curl security updates
  • 10:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testreduce1001.eqiad.wmnet
  • 10:04 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:04 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: testreduce1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 10:02 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: testreduce1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 10:01 jgiannelos@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifeeds: sync
  • 10:00 jgiannelos@deploy2002: helmfile [codfw] START helmfile.d/services/wikifeeds: sync
  • 10:00 jgiannelos@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifeeds: sync
  • 10:00 jgiannelos@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifeeds: sync
  • 09:57 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 09:55 hashar@deploy2002: Finished deploy [integration/docroot@355ddbb]: (no justification provided) (duration: 00m 04s)
  • 09:55 hashar@deploy2002: Started deploy [integration/docroot@355ddbb]: (no justification provided)
  • 09:55 moritzm: installing git security updates on deployment hosts
  • 09:53 hashar@deploy2002: Finished deploy [integration/docroot@355ddbb]: Dummy deploy to test git safe.directory # T335354 (duration: 00m 06s)
  • 09:53 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testreduce1001.eqiad.wmnet
  • 09:53 hashar@deploy2002: Started deploy [integration/docroot@355ddbb]: Dummy deploy to test git safe.directory # T335354
  • 09:38 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mw1349.eqiad.wmnet with reason: Trying to reproduce wdat_wdt watchdog problem
  • 09:38 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on mw1349.eqiad.wmnet with reason: Trying to reproduce wdat_wdt watchdog problem
  • 09:38 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mw1378.eqiad.wmnet with reason: Trying to reproduce wdat_wdt watchdog problem
  • 09:38 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on mw1378.eqiad.wmnet with reason: Trying to reproduce wdat_wdt watchdog problem
  • 09:01 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 15133
  • 09:00 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 15133
  • 08:59 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 13150
  • 08:57 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 13150
  • 08:47 dcausse@deploy2002: Finished scap: Backport for enable page_rerender for 4th batch of wikis (T351503) (duration: 11m 50s)
  • 08:42 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mw1349.eqiad.wmnet with reason: Trying to reproduce wdat_wdt watchdog problem
  • 08:41 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on mw1349.eqiad.wmnet with reason: Trying to reproduce wdat_wdt watchdog problem
  • 08:41 moritzm: installing Exim security updates
  • 08:40 dcausse@deploy2002: pfischer and dcausse: Continuing with sync
  • 08:37 dcausse@deploy2002: pfischer and dcausse: Backport for enable page_rerender for 4th batch of wikis (T351503) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:35 dcausse@deploy2002: Started scap: Backport for enable page_rerender for 4th batch of wikis (T351503)
  • 08:12 kartik@deploy2002: Finished scap: Backport for testwiki: Enable Section translation on WPs with Content Translation available as default (T351882) (duration: 09m 10s)
  • 08:06 kartik@deploy2002: kartik: Continuing with sync
  • 08:04 kartik@deploy2002: kartik: Backport for testwiki: Enable Section translation on WPs with Content Translation available as default (T351882) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:03 kartik@deploy2002: Started scap: Backport for testwiki: Enable Section translation on WPs with Content Translation available as default (T351882)
  • 07:53 moritzm: installing openjdk-8 security updates
  • 07:15 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2143.codfw.wmnet with OS bookworm
  • 06:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2143.codfw.wmnet with reason: host reimage
  • 06:50 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2143.codfw.wmnet with reason: host reimage
  • 06:32 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2143.codfw.wmnet with OS bookworm

2024-01-09

  • 21:23 aqu@deploy2002: Finished deploy [airflow-dags/analytics@ea53374]: Regular airflow-dags/analytics weekly train [airflow-dags@ea53374f] (duration: 00m 28s)
  • 21:22 aqu@deploy2002: Started deploy [airflow-dags/analytics@ea53374]: Regular airflow-dags/analytics weekly train [airflow-dags@ea53374f]
  • 21:21 aqu@deploy2002: Finished deploy [airflow-dags/analytics_test@ea53374]: Regular airflow-dags/analytics_test weekly train [airflow-dags@ea53374f] (duration: 00m 12s)
  • 21:21 aqu@deploy2002: Started deploy [airflow-dags/analytics_test@ea53374]: Regular airflow-dags/analytics_test weekly train [airflow-dags@ea53374f]
  • 21:03 aqu@deploy2002: Finished deploy [analytics/refinery@c4fed56] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@c4fed56c] (test number 2 after permission error) (duration: 00m 05s)
  • 21:03 aqu@deploy2002: Started deploy [analytics/refinery@c4fed56] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@c4fed56c] (test number 2 after permission error)
  • 21:02 aqu@deploy2002: Finished deploy [analytics/refinery@c4fed56] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@c4fed56c] (duration: 03m 33s)
  • 20:59 aqu@deploy2002: Started deploy [analytics/refinery@c4fed56] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@c4fed56c]
  • 20:59 aqu@deploy2002: Finished deploy [analytics/refinery@c4fed56] (thin): Regular analytics weekly train THIN [analytics/refinery@c4fed56c] (duration: 00m 06s)
  • 20:58 aqu@deploy2002: Started deploy [analytics/refinery@c4fed56] (thin): Regular analytics weekly train THIN [analytics/refinery@c4fed56c]
  • 20:58 aqu@deploy2002: Finished deploy [analytics/refinery@c4fed56]: Regular analytics weekly train [analytics/refinery@c4fed56c] (duration: 09m 06s)
  • 20:49 eevans@cumin1002: conftool action : set/weight=0; selector: cluster=restbase,dc=codfw,name=restbase2019.codfw.wmnet
  • 20:49 eevans@cumin1002: conftool action : set/weight=0; selector: cluster=restbase,dc=codfw,name=restbase2014.codfw.wmnet
  • 20:49 eevans@cumin1002: conftool action : set/weight=0; selector: cluster=restbase,dc=codfw,name=restbase2013.codfw.wmnet
  • 20:49 aqu@deploy2002: Started deploy [analytics/refinery@c4fed56]: Regular analytics weekly train [analytics/refinery@c4fed56c]
  • 20:48 aqu: about to deploy analytics/refinery - weekly train
  • 20:40 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: group0 wikis to 1.42.0-wmf.13 refs T350089
  • 20:26 jhuneidi@deploy2002: Finished scap: testwikis wikis to 1.42.0-wmf.13 refs T350089 (duration: 23m 33s)
  • 20:03 jhuneidi@deploy2002: Started scap: testwikis wikis to 1.42.0-wmf.13 refs T350089
  • 19:44 mutante: mwmaint1002 - rm -rf 1.42.0-wmf.7 ; mwmaint2002 - rm -rf php-1.39.0-wmf.25
  • 19:35 mutante: mwmaint1002 - rm -rf /srv/mediawiki/php-1.40.0-wmf.17
  • 19:33 mutante: mwmaint1002 - rm -rf /srv/mediawiki/php-1.39.0-wmf.25 after monitoring alerted about 99% disk usage on /srv
  • 19:26 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: all wikis to 1.42.0-wmf.12 refs T350089
  • 19:16 urandom: decommissioning cassandra, restbase2013-{a,b,c} — T352469
  • 19:14 jhuneidi@deploy2002: Finished scap: testwikis wikis to 1.42.0-wmf.13 refs T350089 (duration: 45m 48s)
  • 18:42 cmooney@cumin1002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin[1001-1002].eqiad.wmnet with reason: Release v0.6.5 - cmooney@cumin1002
  • 18:40 cmooney@cumin1002: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin[1001-1002].eqiad.wmnet with reason: Release v0.6.5 - cmooney@cumin1002
  • 18:29 jhuneidi@deploy2002: Started scap: testwikis wikis to 1.42.0-wmf.13 refs T350089
  • 18:04 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:04 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add new reverse entries for mr1 -> lsw1-a2 link in codfw - cmooney@cumin1002"
  • 18:02 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add new reverse entries for mr1 -> lsw1-a2 link in codfw - cmooney@cumin1002"
  • 18:00 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 17:41 jhancock@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['db2143']
  • 17:33 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2143']
  • 17:31 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['db2143']
  • 17:21 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2143']
  • 17:17 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti-test2004.codfw.wmnet
  • 17:17 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:17 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti-test2004.codfw.wmnet decommissioned, removing all IPs except the asset tag one - ayounsi@cumin1002"
  • 17:14 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti-test2004.codfw.wmnet decommissioned, removing all IPs except the asset tag one - ayounsi@cumin1002"
  • 17:12 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
  • 17:06 ayounsi@cumin1002: START - Cookbook sre.hosts.decommission for hosts ganeti-test2004.codfw.wmnet
  • 17:05 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti-test[1001-1002].eqiad.wmnet
  • 17:05 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:05 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti-test[1001-1002].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - ayounsi@cumin1002"
  • 17:04 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti-test[1001-1002].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - ayounsi@cumin1002"
  • 17:02 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
  • 16:53 ayounsi@cumin1002: START - Cookbook sre.hosts.decommission for hosts ganeti-test[1001-1002].eqiad.wmnet
  • 16:27 jayme: restart prometheus@k8s on prometheus1005 revert GOGC to 100 (default) - T354604
  • 16:22 mutante: phabricator - differential has been disabled (T330797)
  • 16:11 brennen@deploy2002: Finished deploy [phabricator/deployment@369e797]: deploy to phab1004 for T354545 (duration: 00m 56s)
  • 16:10 brennen@deploy2002: Started deploy [phabricator/deployment@369e797]: deploy to phab1004 for T354545
  • 16:10 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudrabbit1003.wikimedia.org
  • 16:10 taavi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:10 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudrabbit1003.wikimedia.org decommissioned, removing all IPs except the asset tag one - taavi@cumin1002"
  • 16:09 brennen@deploy2002: Finished deploy [phabricator/deployment@369e797]: deploy to phab2002 for T354545 (duration: 00m 55s)
  • 16:09 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudrabbit1003.wikimedia.org decommissioned, removing all IPs except the asset tag one - taavi@cumin1002"
  • 16:09 mutante: phabricator deployment in progress
  • 16:08 brennen@deploy2002: Started deploy [phabricator/deployment@369e797]: deploy to phab2002 for T354545
  • 16:08 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on phab2002.codfw.wmnet with reason: deployment
  • 16:08 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on phab2002.codfw.wmnet with reason: deployment
  • 16:07 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on phab1004.eqiad.wmnet with reason: deployment
  • 16:04 taavi@cumin1002: START - Cookbook sre.dns.netbox
  • 15:58 taavi@cumin1002: START - Cookbook sre.hosts.decommission for hosts cloudrabbit1003.wikimedia.org
  • 15:54 jayme: restart prometheus@k8s on prometheus1005 with GOGC=60 - T354604
  • 15:37 akosiaris: depool and reboot mw1349 for a test T354413
  • 15:36 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_eqiad and A:cp
  • 15:19 sukhe: restart pybal on lvs1019: T336043
  • 15:19 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_eqiad and A:cp
  • 15:16 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 15:16 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
  • 15:16 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 15:15 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
  • 15:14 sukhe: restart pybal on lvs1020: T336043
  • 15:06 TheresNoTime: done UTC afternoon backport window
  • 15:03 jgiannelos@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifeeds: apply
  • 15:02 jgiannelos@deploy2002: helmfile [codfw] START helmfile.d/services/wikifeeds: apply
  • 15:02 jgiannelos@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifeeds: apply
  • 15:01 TheresNoTime: `[samtar@mwmaint2002 ~]$ echo 'https://en.wikipedia.org/static/images/mobile/copyright/wikinews-wordmark-zh.svg' | mwscript purgeList.php` T353792
  • 15:01 jgiannelos@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifeeds: apply
  • 15:00 TheresNoTime: `[samtar@mwmaint2002 ~]$ mwscript namespaceDupes.php --wiki bjnwikiquote --add-prefix "BROKEN " --fix` T350235
  • 14:59 TheresNoTime: `[samtar@mwmaint2002 ~]$ mwscript namespaceDupes.php --wiki zghwiki --add-prefix "BROKEN " --fix` T350241
  • 14:58 samtar@deploy2002: Finished scap: Backport for zghwiki: add metanamespace (T350241), bjnwikiquote: add metanamespace (T350235) (duration: 12m 10s)
  • 14:56 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
  • 14:56 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/wikifeeds: apply
  • 14:52 samtar@deploy2002: samtar and anzx: Continuing with sync
  • 14:50 samtar@deploy2002: samtar and anzx: Backport for zghwiki: add metanamespace (T350241), bjnwikiquote: add metanamespace (T350235) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:46 samtar@deploy2002: Started scap: Backport for zghwiki: add metanamespace (T350241), bjnwikiquote: add metanamespace (T350235)
  • 14:45 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2034.codfw.wmnet with OS bookworm
  • 14:44 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - ayounsi@cumin1002"
  • 14:43 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_eqiad and A:cp
  • 14:42 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - ayounsi@cumin1002"
  • 14:38 TheresNoTime: `[samtar@mwmaint2002 ~]$ mwscript namespaceDupes.php --wiki hewikinews --fix` T349581
  • 14:38 samtar@deploy2002: Finished scap: Backport for Create draft namespace and add namespaces aliases for hewikinews (T349581) (duration: 10m 05s)
  • 14:36 kevinbazira@deploy2002: helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
  • 14:35 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
  • 14:34 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host snapshot1014.eqiad.wmnet
  • 14:32 samtar@deploy2002: samtar and anzx: Continuing with sync
  • 14:30 samtar@deploy2002: samtar and anzx: Backport for Create draft namespace and add namespaces aliases for hewikinews (T349581) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:28 samtar@deploy2002: Started scap: Backport for Create draft namespace and add namespaces aliases for hewikinews (T349581)
  • 14:27 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_eqiad and A:cp
  • 14:26 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.restart (exit_code=99)
  • 14:26 bking@cumin2002: START - Cookbook sre.wdqs.restart
  • 14:26 TheresNoTime: deployed patch for T350739, logging bot not working?
  • 14:24 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti2034.codfw.wmnet with reason: host reimage
  • 14:23 samtar@deploy2002: Finished scap: Backport for [namespaces] Use correct diacritics in Romanian (duration: 14m 42s)
  • 14:22 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_esams and not P{cp3066.esams.wmnet} and A:cp
  • 14:21 ayounsi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti2034.codfw.wmnet with reason: host reimage
  • 14:16 samtar@deploy2002: strainu and samtar: Continuing with sync
  • 14:13 eevans@cumin1002: conftool action : set/weight=10; selector: cluster=restbase,dc=codfw,name=restbase2035.codfw.wmnet
  • 14:12 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for restbase2035.codfw.wmnet
  • 14:12 eevans@cumin1002: START - Cookbook sre.hosts.remove-downtime for restbase2035.codfw.wmnet
  • 14:09 samtar@deploy2002: strainu and samtar: Backport for [namespaces] Use correct diacritics in Romanian synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:08 samtar@deploy2002: Started scap: Backport for [namespaces] Use correct diacritics in Romanian
  • 14:04 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_esams and not P{cp3066.esams.wmnet} and A:cp
  • 14:01 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host ganeti2034.codfw.wmnet with OS bookworm
  • 14:01 ayounsi@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host ganeti2034.codfw.wmnet with OS bookworm
  • 13:58 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2033.codfw.wmnet with OS bookworm
  • 13:58 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - ayounsi@cumin1002"
  • 13:56 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - ayounsi@cumin1002"
  • 13:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host snapshot1014.eqiad.wmnet
  • 13:43 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host snapshot1014.eqiad.wmnet with OS bullseye
  • 13:41 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host ganeti2034.codfw.wmnet with OS bookworm
  • 13:37 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti2033.codfw.wmnet with reason: host reimage
  • 13:34 ayounsi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti2033.codfw.wmnet with reason: host reimage
  • 13:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1180 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54575 and previous config saved to /var/cache/conftool/dbconfig/20240109-133327-root.json
  • 13:20 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
  • 13:18 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/wikifeeds: apply
  • 13:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1180 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54574 and previous config saved to /var/cache/conftool/dbconfig/20240109-131822-root.json
  • 13:16 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/wikifeeds: apply
  • 13:14 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host ganeti2033.codfw.wmnet with OS bookworm
  • 13:13 stevemunene@cumin1002: END (FAIL) - Cookbook sre.hadoop.roll-restart-masters (exit_code=99) restart masters for Hadoop analytics cluster: Restart of jvm daemons.
  • 13:10 btullis@cumin1002: END (PASS) - Cookbook sre.presto.roll-restart-workers (exit_code=0) for Presto analytics cluster: Roll restart of all Presto's jvm daemons.
  • 13:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1180 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54573 and previous config saved to /var/cache/conftool/dbconfig/20240109-130317-root.json
  • 13:00 hnowlan@deploy2002: helmfile [eqiad] [main] DONE helmfile.d/services/mw-jobrunner : sync
  • 13:00 hnowlan@deploy2002: helmfile [eqiad] [main] START helmfile.d/services/mw-jobrunner : sync
  • 12:58 stevemunene@cumin1002: START - Cookbook sre.hadoop.roll-restart-masters restart masters for Hadoop analytics cluster: Restart of jvm daemons.
  • 12:57 hnowlan@deploy2002: helmfile [codfw] [main] DONE helmfile.d/services/mw-jobrunner : sync
  • 12:57 hnowlan@deploy2002: helmfile [codfw] [main] START helmfile.d/services/mw-jobrunner : sync
  • 12:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1180 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54572 and previous config saved to /var/cache/conftool/dbconfig/20240109-124812-root.json
  • 12:43 moritzm: imported mwbzutils 0.1.4~wmf-1+deb11u1 for bullseye-wikimedia T325228
  • 12:43 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on mw[1380-1382].eqiad.wmnet with reason: failed reimage waiting on fix
  • 12:42 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on mw[1380-1382].eqiad.wmnet with reason: failed reimage waiting on fix
  • 12:39 btullis@cumin1002: START - Cookbook sre.presto.roll-restart-workers for Presto analytics cluster: Roll restart of all Presto's jvm daemons.
  • 12:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1180 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54571 and previous config saved to /var/cache/conftool/dbconfig/20240109-123307-root.json
  • 12:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1180 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54570 and previous config saved to /var/cache/conftool/dbconfig/20240109-121802-root.json
  • 12:17 stevemunene@cumin1002: END (PASS) - Cookbook sre.hadoop.roll-restart-masters (exit_code=0) restart masters for Hadoop test cluster: Restart of jvm daemons.
  • 12:10 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_esams and A:cp
  • 12:07 taavi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:07 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove wiki replica LVS VIPs - taavi@cumin1002"
  • 12:06 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1180.eqiad.wmnet with OS bookworm
  • 12:06 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove wiki replica LVS VIPs - taavi@cumin1002"
  • 12:04 taavi@cumin1002: START - Cookbook sre.dns.netbox
  • 12:02 marostegui@cumin1001: dbctl commit (dc=all): 'db1180 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54569 and previous config saved to /var/cache/conftool/dbconfig/20240109-120257-root.json
  • 12:01 btullis@cumin1002: END (PASS) - Cookbook sre.kafka.roll-restart-mirror-maker (exit_code=0) restart MirrorMaker for Kafka A:kafka-mirror-maker-jumbo-eqiad cluster: Roll restart of jvm daemons.
  • 11:50 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:50 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update dns entry for kubestage2002.codfw.wmnet - cmooney@cumin1002"
  • 11:50 stevemunene@cumin1002: START - Cookbook sre.hadoop.roll-restart-masters restart masters for Hadoop test cluster: Restart of jvm daemons.
  • 11:50 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_esams and A:cp
  • 11:49 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update dns entry for kubestage2002.codfw.wmnet - cmooney@cumin1002"
  • 11:46 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 11:46 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1180.eqiad.wmnet with reason: host reimage
  • 11:43 btullis@cumin1002: START - Cookbook sre.kafka.roll-restart-mirror-maker restart MirrorMaker for Kafka A:kafka-mirror-maker-jumbo-eqiad cluster: Roll restart of jvm daemons.
  • 11:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1180.eqiad.wmnet with reason: host reimage
  • 11:38 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_drmrs and A:cp
  • 11:37 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on lsw1-b8-codfw,lsw1-b8-codfw IPv6 with reason: Adding vlan to switch, precaution in case it triggers EVPN L3 bug.
  • 11:37 btullis@cumin1002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-jumbo-eqiad
  • 11:37 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on lsw1-b8-codfw,lsw1-b8-codfw IPv6 with reason: Adding vlan to switch, precaution in case it triggers EVPN L3 bug.
  • 11:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on snapshot1014.eqiad.wmnet with reason: host reimage
  • 11:32 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on snapshot1014.eqiad.wmnet with reason: host reimage
  • 11:30 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1180.eqiad.wmnet with OS bookworm
  • 11:30 cgoubert@cumin2002: conftool action : set/pooled=yes; selector: name=mw2394.codfw.wmnet,cluster=jobrunner
  • 11:29 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1180 T354506', diff saved to https://phabricator.wikimedia.org/P54568 and previous config saved to /var/cache/conftool/dbconfig/20240109-112922-root.json
  • 11:22 cgoubert@cumin2002: conftool action : set/pooled=no; selector: name=mw2394.codfw.wmnet
  • 11:19 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host snapshot1014.eqiad.wmnet with OS bullseye
  • 11:19 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 11:19 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
  • 11:18 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 11:18 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
  • 11:17 hnowlan@deploy2002: helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 11:17 hnowlan@deploy2002: helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply
  • 11:15 taavi@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1013.eqiad.wmnet,service=s3
  • 11:15 taavi@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1013.eqiad.wmnet,service=s3
  • 11:14 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_drmrs and A:cp
  • 11:05 moritzm: installing exim security updates
  • 10:54 godog: restart prometheus@k8s on prometheus1005 to see if labeldrop id will yield expected results - T354604
  • 10:45 ayounsi@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host ganeti2033.codfw.wmnet with OS bookworm
  • 10:38 btullis@cumin1002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-jumbo-eqiad
  • 10:22 sfaci@deploy2002: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
  • 10:21 sfaci@deploy2002: helmfile [staging] START helmfile.d/services/edit-analytics: apply
  • 10:19 btullis@cumin1002: END (PASS) - Cookbook sre.opensearch.roll-restart-reboot (exit_code=0) rolling restart_daemons on A:datahubsearch
  • 10:11 btullis@cumin1002: START - Cookbook sre.opensearch.roll-restart-reboot rolling restart_daemons on A:datahubsearch
  • 10:00 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_drmrs and A:cp
  • 09:59 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host ganeti2033.codfw.wmnet with OS bookworm
  • 09:54 oblivian@deploy2002: Finished scap: Backport for Always process media files via shellbox on k8s (T352515) (duration: 11m 03s)
  • 09:52 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:52 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti2033/2034 move - ayounsi@cumin1002"
  • 09:48 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti2033/2034 move - ayounsi@cumin1002"
  • 09:47 oblivian@deploy2002: oblivian: Continuing with sync
  • 09:46 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
  • 09:44 oblivian@deploy2002: oblivian: Backport for Always process media files via shellbox on k8s (T352515) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 09:43 oblivian@deploy2002: Started scap: Backport for Always process media files via shellbox on k8s (T352515)
  • 09:39 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_drmrs and A:cp
  • 09:34 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_codfw and A:cp
  • 09:27 oblivian@deploy2002: Finished scap: Backport for Use shellbox for djvu handling on kubernetes (T352515) (duration: 23m 56s)
  • 09:20 oblivian@deploy2002: oblivian: Continuing with sync
  • 09:15 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_codfw and A:cp
  • 09:14 moritzm: prune obsolete nginx packages from ncredir hosts after migration to new library scheme T329529
  • 09:11 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_codfw and A:cp
  • 09:06 arnaudb: upload wmfdb 0.1.4 from https://gitlab.wikimedia.org/repos/sre/wmfdb/-/tree/dgit/bookworm-wikimedia to fix default ca bundle
  • 09:05 oblivian@deploy2002: oblivian: Backport for Use shellbox for djvu handling on kubernetes (T352515) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 09:03 oblivian@deploy2002: Started scap: Backport for Use shellbox for djvu handling on kubernetes (T352515)
  • 08:59 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 45287
  • 08:54 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 45287
  • 08:54 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_codfw and A:cp
  • 08:49 oblivian@deploy2002: Finished scap: Backport for Remove throttle exception (T352569) (duration: 09m 01s)
  • 08:48 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 9902
  • 08:47 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 9902
  • 08:42 oblivian@deploy2002: oblivian: Continuing with sync
  • 08:42 oblivian@deploy2002: oblivian: Backport for Remove throttle exception (T352569) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:40 oblivian@deploy2002: Started scap: Backport for Remove throttle exception (T352569)
  • 08:22 kartik@deploy2002: Finished scap: Backport for testwiki: Enable Section translation on WPs with potential to be supported with MinT using MADLAD-400 (T353510) (duration: 15m 54s)
  • 08:21 marostegui@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db2143.codfw.wmnet with OS bookworm
  • 08:20 godog: set aside WAL for prometheus@k8s in codfw and restart - T354399
  • 08:19 marostegui@cumin1001: dbctl commit (dc=all): 'db2151 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54567 and previous config saved to /var/cache/conftool/dbconfig/20240109-081946-root.json
  • 08:11 kartik@deploy2002: kartik: Continuing with sync
  • 08:10 kartik@deploy2002: kartik: Backport for testwiki: Enable Section translation on WPs with potential to be supported with MinT using MADLAD-400 (T353510) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:06 kartik@deploy2002: Started scap: Backport for testwiki: Enable Section translation on WPs with potential to be supported with MinT using MADLAD-400 (T353510)
  • 08:05 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 100%: After a crash', diff saved to https://phabricator.wikimedia.org/P54566 and previous config saved to /var/cache/conftool/dbconfig/20240109-080558-root.json
  • 08:04 marostegui@cumin1001: dbctl commit (dc=all): 'db2151 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54565 and previous config saved to /var/cache/conftool/dbconfig/20240109-080441-root.json
  • 07:50 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 75%: After a crash', diff saved to https://phabricator.wikimedia.org/P54564 and previous config saved to /var/cache/conftool/dbconfig/20240109-075053-root.json
  • 07:49 marostegui@cumin1001: dbctl commit (dc=all): 'db2151 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54563 and previous config saved to /var/cache/conftool/dbconfig/20240109-074936-root.json
  • 07:35 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 50%: After a crash', diff saved to https://phabricator.wikimedia.org/P54562 and previous config saved to /var/cache/conftool/dbconfig/20240109-073548-root.json
  • 07:34 marostegui@cumin1001: dbctl commit (dc=all): 'db2151 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54561 and previous config saved to /var/cache/conftool/dbconfig/20240109-073431-root.json
  • 07:20 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 25%: After a crash', diff saved to https://phabricator.wikimedia.org/P54560 and previous config saved to /var/cache/conftool/dbconfig/20240109-072043-root.json
  • 07:19 marostegui@cumin1001: dbctl commit (dc=all): 'db2151 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54559 and previous config saved to /var/cache/conftool/dbconfig/20240109-071926-root.json
  • 07:05 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 10%: After a crash', diff saved to https://phabricator.wikimedia.org/P54558 and previous config saved to /var/cache/conftool/dbconfig/20240109-070538-root.json
  • 07:04 marostegui@cumin1001: dbctl commit (dc=all): 'db2151 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54557 and previous config saved to /var/cache/conftool/dbconfig/20240109-070421-root.json
  • 07:01 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2143.codfw.wmnet with OS bookworm
  • 06:54 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2151.codfw.wmnet with OS bookworm
  • 06:50 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 5%: After a crash', diff saved to https://phabricator.wikimedia.org/P54556 and previous config saved to /var/cache/conftool/dbconfig/20240109-065033-root.json
  • 06:49 marostegui@cumin1001: dbctl commit (dc=all): 'db2151 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54555 and previous config saved to /var/cache/conftool/dbconfig/20240109-064916-root.json
  • 06:35 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 1%: After a crash', diff saved to https://phabricator.wikimedia.org/P54554 and previous config saved to /var/cache/conftool/dbconfig/20240109-063528-root.json
  • 06:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2151.codfw.wmnet with reason: host reimage
  • 06:28 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2151.codfw.wmnet with reason: host reimage
  • 06:28 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1224', diff saved to https://phabricator.wikimedia.org/P54553 and previous config saved to /var/cache/conftool/dbconfig/20240109-062806-root.json
  • 06:11 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2151.codfw.wmnet with OS bookworm
  • 06:10 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2151 T354506', diff saved to https://phabricator.wikimedia.org/P54552 and previous config saved to /var/cache/conftool/dbconfig/20240109-061015-root.json
  • 03:11 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 03:11 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 03:11 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 03:10 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 03:10 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 03:10 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 01:22 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mw2394.mgmt.codfw.wmnet with reboot policy FORCED
  • 01:17 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mw2394.mgmt.codfw.wmnet with reboot policy FORCED

2024-01-08

  • 23:16 eileen: civicrm upgraded from 16b5417b to c7304245
  • 22:58 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mw2394.mgmt.codfw.wmnet with reboot policy FORCED
  • 22:57 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mw2394.mgmt.codfw.wmnet with reboot policy FORCED
  • 22:56 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest2003.mgmt.codfw.wmnet with reboot policy GRACEFUL
  • 22:30 ryankemper@puppetmaster1001: conftool action : set/weight=10:pooled=yes; selector: name=elastic2087\.codfw\.wmnet
  • 22:04 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host sretest2003.mgmt.codfw.wmnet with reboot policy GRACEFUL
  • 21:50 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:49 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:37 cjming: end of UTC late backport window
  • 21:32 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mw2394.mgmt.codfw.wmnet with reboot policy FORCED
  • 21:29 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mw2394.mgmt.codfw.wmnet with reboot policy FORCED
  • 21:27 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mw2394.mgmt.codfw.wmnet with reboot policy FORCED
  • 21:24 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mw2394.mgmt.codfw.wmnet with reboot policy FORCED
  • 21:15 cjming@deploy2002: Finished scap: Backport for Remove android.metrics_platform.* stream definitions (T354199) (duration: 08m 17s)
  • 21:08 cjming@deploy2002: cjming: Continuing with sync
  • 21:08 cjming@deploy2002: cjming: Backport for Remove android.metrics_platform.* stream definitions (T354199) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:07 cjming@deploy2002: Started scap: Backport for Remove android.metrics_platform.* stream definitions (T354199)
  • 19:30 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mw2394.mgmt.codfw.wmnet with reboot policy FORCED
  • 19:28 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mw2394.mgmt.codfw.wmnet with reboot policy FORCED
  • 19:27 taavi: make puppet re-generate empty envoy config file on testreduce1002 T345220
  • 19:15 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
  • 19:13 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
  • 19:11 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
  • 19:09 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
  • 19:08 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
  • 19:04 sukhe: running authdns-update for CR 988684: T345220
  • 19:04 sukhe: running authdns-update for CR 988684: T336043
  • 18:59 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
  • 18:35 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
  • 18:34 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
  • 18:27 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
  • 18:25 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
  • 18:21 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
  • 18:19 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
  • 18:12 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
  • 18:10 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
  • 17:56 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
  • 17:53 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
  • 17:43 jdrewniak@deploy2002: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 06m 17s)
  • 17:36 jdrewniak@deploy2002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 06m 21s)
  • 17:34 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1377.eqiad.wmnet with OS bullseye
  • 17:18 godog: wipe prometheus@k8s eqiad WAL and restart - T354399
  • 17:17 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1377.eqiad.wmnet with reason: host reimage
  • 17:15 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:15 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:14 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1377.eqiad.wmnet with reason: host reimage
  • 17:14 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_eqsin and A:cp
  • 17:12 ladsgroup@deploy2002: Finished scap: Backport for Undeploy Listings extension part III (T253216) (duration: 08m 01s)
  • 17:08 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:07 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:06 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:06 ladsgroup@deploy2002: ladsgroup: Continuing with sync
  • 17:06 ladsgroup@deploy2002: ladsgroup: Backport for Undeploy Listings extension part III (T253216) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 17:05 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:04 ladsgroup@deploy2002: Started scap: Backport for Undeploy Listings extension part III (T253216)
  • 17:04 ladsgroup@deploy2002: Finished scap: Backport for Undeploy Listings extension part III (T253216) (duration: 12m 24s)
  • 17:00 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1377.eqiad.wmnet with OS bullseye
  • 16:57 ladsgroup@deploy2002: ladsgroup: Continuing with sync
  • 16:54 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw1377.eqiad.wmnet with OS bullseye
  • 16:53 ladsgroup@deploy2002: ladsgroup: Backport for Undeploy Listings extension part III (T253216) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 16:52 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti2034.codfw.wmnet
  • 16:52 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:52 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti2034.codfw.wmnet decommissioned, removing all IPs except the asset tag one - pt1979@cumin2002"
  • 16:52 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1377.eqiad.wmnet with reason: host reimage
  • 16:51 ladsgroup@deploy2002: Started scap: Backport for Undeploy Listings extension part III (T253216)
  • 16:49 ladsgroup@deploy2002: Finished scap: Backport for Undeploy Listings extension part III (T253216) (duration: 08m 47s)
  • 16:49 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1377.eqiad.wmnet with reason: host reimage
  • 16:48 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti2034.codfw.wmnet decommissioned, removing all IPs except the asset tag one - pt1979@cumin2002"
  • 16:46 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_eqsin and A:cp
  • 16:44 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_eqsin and not P{cp[5030,5032].eqsin.wmnet} and A:cp
  • 16:43 ladsgroup@deploy2002: ladsgroup: Continuing with sync
  • 16:42 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 16:42 ladsgroup@deploy2002: ladsgroup: Backport for Undeploy Listings extension part III (T253216) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 16:41 ladsgroup@deploy2002: Started scap: Backport for Undeploy Listings extension part III (T253216)
  • 16:37 pt1979@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti2034.codfw.wmnet
  • 16:36 btullis@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) dbstore1008.eqiad.wmnet on all recursors
  • 16:36 btullis@cumin1002: START - Cookbook sre.dns.wipe-cache dbstore1008.eqiad.wmnet on all recursors
  • 16:35 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1377.eqiad.wmnet with OS bullseye
  • 16:35 btullis@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:35 btullis@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove unwanted AAAA records from new dbstore hosts - btullis@cumin1002"
  • 16:34 btullis@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove unwanted AAAA records from new dbstore hosts - btullis@cumin1002"
  • 16:33 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti2033.codfw.wmnet
  • 16:33 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:33 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti2033.codfw.wmnet decommissioned, removing all IPs except the asset tag one - pt1979@cumin2002"
  • 16:32 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti2033.codfw.wmnet decommissioned, removing all IPs except the asset tag one - pt1979@cumin2002"
  • 16:30 btullis@cumin1002: START - Cookbook sre.dns.netbox
  • 16:25 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 16:25 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_eqsin and not P{cp[5030,5032].eqsin.wmnet} and A:cp
  • 16:25 ladsgroup@deploy2002: Finished scap: Backport for Undeploy Listings extension part III (T253216) (duration: 24m 06s)
  • 16:24 taavi: lvs1018: sudo ipvsadm --delete-service --tcp-service 208.80.154.243:3311 (and all the way to :3318) - T346947
  • 16:23 taavi: lvs1018: sudo ipvsadm --delete-service --tcp-service 208.80.154.242:3311 (and all the way to :3318) - T346947
  • 16:21 taavi: lvs1020: sudo ipvsadm --delete-service --tcp-service 208.80.154.243:3311 (and all the way to :3318) - T346947
  • 16:20 taavi: lvs1020: sudo ipvsadm --delete-service --tcp-service 208.80.154.242:3311 (and all the way to :3318) - T346947
  • 16:18 pt1979@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti2033.codfw.wmnet
  • 16:15 taavi: restart pybal on lvs1018 - T346947
  • 16:14 ladsgroup@deploy2002: ladsgroup: Continuing with sync
  • 16:14 ladsgroup@deploy2002: ladsgroup: Backport for Undeploy Listings extension part III (T253216) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 16:09 taavi: restart pybal on lvs1020 - T346947
  • 16:01 ladsgroup@deploy2002: Started scap: Backport for Undeploy Listings extension part III (T253216)
  • 15:59 sfaci@deploy2002: helmfile [eqiad] DONE helmfile.d/services/edit-analytics: apply
  • 15:59 sfaci@deploy2002: helmfile [eqiad] START helmfile.d/services/edit-analytics: apply
  • 15:58 ladsgroup@deploy2002: Finished scap: Backport for Undeploy listing extension part II (T253216) (duration: 08m 40s)
  • 15:57 sfaci@deploy2002: helmfile [codfw] DONE helmfile.d/services/edit-analytics: apply
  • 15:57 sfaci@deploy2002: helmfile [codfw] START helmfile.d/services/edit-analytics: apply
  • 15:52 ladsgroup@deploy2002: ladsgroup: Continuing with sync
  • 15:51 ladsgroup@deploy2002: ladsgroup: Backport for Undeploy listing extension part II (T253216) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 15:49 ladsgroup@deploy2002: Started scap: Backport for Undeploy listing extension part II (T253216)
  • 15:48 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on mw1377.eqiad.wmnet with reason: reboot debugging
  • 15:48 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on mw1377.eqiad.wmnet with reason: reboot debugging
  • 15:47 ladsgroup@deploy2002: Finished scap: Backport for Undeploy Listings extension, part I (T253216) (duration: 08m 22s)
  • 15:46 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 15:46 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 15:45 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 15:41 ladsgroup@deploy2002: ladsgroup: Continuing with sync
  • 15:40 ladsgroup@deploy2002: ladsgroup: Backport for Undeploy Listings extension, part I (T253216) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 15:40 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 15:38 ladsgroup@deploy2002: Started scap: Backport for Undeploy Listings extension, part I (T253216)
  • 15:35 claime: Draining and cordoning kubestage2002.codfw.wmnet - T352883
  • 15:32 krinkle@deploy2002: Finished scap: Backport for Fix parsing logic when comments or hidden characters are present (T354385) (duration: 07m 52s)
  • 15:26 krinkle@deploy2002: krinkle: Continuing with sync
  • 15:26 krinkle@deploy2002: krinkle: Backport for Fix parsing logic when comments or hidden characters are present (T354385) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 15:24 krinkle@deploy2002: Started scap: Backport for Fix parsing logic when comments or hidden characters are present (T354385)
  • 14:46 urbanecm@deploy2002: Finished scap: Backport for Add agent.app_install_id to android.product_metrics.* streams (T353680), Remove partial migration of EditAttemptStep instrument (T351335), Add new stream names to the config variable (T353297), agent.app_ -> agent_app_ in android.product_metrics.* streams (T353680) (duration: 10m 22s)
  • 14:40 urbanecm@deploy2002: urbanecm and phuedx and ksarabia and sfaci: Continuing with sync
  • 14:37 urbanecm@deploy2002: urbanecm and phuedx and ksarabia and sfaci: Backport for Add agent.app_install_id to android.product_metrics.* streams (T353680), Remove partial migration of EditAttemptStep instrument (T351335), Add new stream names to the config variable (T353297), agent.app_ -> agent_app_ in android.product_metrics.* streams (T353680) synce
  • 14:35 urbanecm@deploy2002: Started scap: Backport for Add agent.app_install_id to android.product_metrics.* streams (T353680), Remove partial migration of EditAttemptStep instrument (T351335), Add new stream names to the config variable (T353297), agent.app_ -> agent_app_ in android.product_metrics.* streams (T353680)
  • 14:34 urbanecm@deploy2002: Sync cancelled.
  • 14:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10 days, 0:00:00 on debmonitor2003.codfw.wmnet with reason: WIP
  • 14:27 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 10 days, 0:00:00 on debmonitor2003.codfw.wmnet with reason: WIP
  • 14:17 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54548 and previous config saved to /var/cache/conftool/dbconfig/20240108-141717-root.json
  • 14:14 urbanecm@deploy2002: urbanecm and phuedx and ksarabia and sfaci: Backport for Add agent.app_install_id to android.product_metrics.* streams (T353680), Remove partial migration of EditAttemptStep instrument (T351335), Add new stream names to the config variable (T353297) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:13 urbanecm@deploy2002: Started scap: Backport for Add agent.app_install_id to android.product_metrics.* streams (T353680), Remove partial migration of EditAttemptStep instrument (T351335), Add new stream names to the config variable (T353297)
  • 14:12 urbanecm@deploy2002: Finished scap: Backport for enable page_rerender for 3rd batch of wikis (T351503) (duration: 09m 35s)
  • 14:06 urbanecm@deploy2002: pfischer and urbanecm: Continuing with sync
  • 14:04 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 14:04 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 14:04 urbanecm@deploy2002: pfischer and urbanecm: Backport for enable page_rerender for 3rd batch of wikis (T351503) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:02 urbanecm@deploy2002: Started scap: Backport for enable page_rerender for 3rd batch of wikis (T351503)
  • 14:02 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54547 and previous config saved to /var/cache/conftool/dbconfig/20240108-140212-root.json
  • 14:01 moritzm: installing curl security updates
  • 13:47 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54546 and previous config saved to /var/cache/conftool/dbconfig/20240108-134707-root.json
  • 13:33 jelto@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 13:33 jelto@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 13:32 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54545 and previous config saved to /var/cache/conftool/dbconfig/20240108-133202-root.json
  • 13:32 jelto@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 13:31 jelto@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 13:30 marostegui@cumin1001: dbctl commit (dc=all): 'db2117 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54544 and previous config saved to /var/cache/conftool/dbconfig/20240108-133016-root.json
  • 13:16 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54543 and previous config saved to /var/cache/conftool/dbconfig/20240108-131657-root.json
  • 13:15 marostegui@cumin1001: dbctl commit (dc=all): 'db2117 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54542 and previous config saved to /var/cache/conftool/dbconfig/20240108-131511-root.json
  • 13:01 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54541 and previous config saved to /var/cache/conftool/dbconfig/20240108-130152-root.json
  • 13:00 marostegui@cumin1001: dbctl commit (dc=all): 'db2117 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54540 and previous config saved to /var/cache/conftool/dbconfig/20240108-130006-root.json
  • 12:46 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54539 and previous config saved to /var/cache/conftool/dbconfig/20240108-124647-root.json
  • 12:46 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1224.eqiad.wmnet with OS bookworm
  • 12:45 marostegui@cumin1001: dbctl commit (dc=all): 'db2117 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54538 and previous config saved to /var/cache/conftool/dbconfig/20240108-124501-root.json
  • 12:29 marostegui@cumin1001: dbctl commit (dc=all): 'db2117 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54537 and previous config saved to /var/cache/conftool/dbconfig/20240108-122956-root.json
  • 12:26 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1224.eqiad.wmnet with reason: host reimage
  • 12:23 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1224.eqiad.wmnet with reason: host reimage
  • 12:14 marostegui@cumin1001: dbctl commit (dc=all): 'db2117 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54536 and previous config saved to /var/cache/conftool/dbconfig/20240108-121451-root.json
  • 12:10 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1224.eqiad.wmnet with OS bookworm
  • 12:08 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1224 T354506', diff saved to https://phabricator.wikimedia.org/P54535 and previous config saved to /var/cache/conftool/dbconfig/20240108-120759-root.json
  • 12:03 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 45287
  • 12:02 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 45287
  • 12:02 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 35847
  • 12:02 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 35847
  • 12:01 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 9902
  • 12:00 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 9902
  • 12:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2117.codfw.wmnet with OS bookworm
  • 11:59 marostegui@cumin1001: dbctl commit (dc=all): 'db2117 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54534 and previous config saved to /var/cache/conftool/dbconfig/20240108-115946-root.json
  • 11:57 ladsgroup@deploy2002: Finished scap: Backport for Disable Listings extension everywhere except rowikivoyage (T253216) (duration: 08m 43s)
  • 11:50 ladsgroup@deploy2002: ladsgroup: Continuing with sync
  • 11:50 ladsgroup@deploy2002: ladsgroup: Backport for Disable Listings extension everywhere except rowikivoyage (T253216) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 11:48 ladsgroup@deploy2002: Started scap: Backport for Disable Listings extension everywhere except rowikivoyage (T253216)
  • 11:45 taavi@deploy2002: Finished scap: Backport for OATHAuthServices: Fix service name (T354505), Fix disabling two-factor authentication (T354505) (duration: 09m 21s)
  • 11:39 taavi@deploy2002: taavi: Continuing with sync
  • 11:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2117.codfw.wmnet with reason: host reimage
  • 11:38 taavi@deploy2002: taavi: Backport for OATHAuthServices: Fix service name (T354505), Fix disabling two-factor authentication (T354505) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 11:36 taavi@deploy2002: Started scap: Backport for OATHAuthServices: Fix service name (T354505), Fix disabling two-factor authentication (T354505)
  • 11:35 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2117.codfw.wmnet with reason: host reimage
  • 11:29 ladsgroup@deploy2002: Finished scap: Backport for Stop writing to the old columns of pagelinks in testwiki (T352010) (duration: 10m 02s)
  • 11:23 ladsgroup@deploy2002: ladsgroup: Continuing with sync
  • 11:20 ladsgroup@deploy2002: ladsgroup: Backport for Stop writing to the old columns of pagelinks in testwiki (T352010) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 11:19 ladsgroup@deploy2002: Started scap: Backport for Stop writing to the old columns of pagelinks in testwiki (T352010)
  • 11:17 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2117.codfw.wmnet with OS bookworm
  • 11:14 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2117 T354506', diff saved to https://phabricator.wikimedia.org/P54533 and previous config saved to /var/cache/conftool/dbconfig/20240108-111452-root.json
  • 10:36 XioNoX: repool eqsin - T332395
  • 10:33 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:32 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:21 ladsgroup@deploy2002: Finished scap: Backport for styles: Replace obsolete WikimediaUI Base var with Codex alias (duration: 07m 32s)
  • 10:20 klausman@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
  • 10:20 klausman@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
  • 10:15 ladsgroup@deploy2002: volker-e and ladsgroup: Continuing with sync
  • 10:15 ladsgroup@deploy2002: volker-e and ladsgroup: Backport for styles: Replace obsolete WikimediaUI Base var with Codex alias synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 10:14 ladsgroup@deploy2002: Started scap: Backport for styles: Replace obsolete WikimediaUI Base var with Codex alias
  • 10:11 ladsgroup@deploy2002: Finished scap: Backport for Set commonswiki pagelinks migration stage to READ NEW (T351237) (duration: 08m 52s)
  • 10:05 ladsgroup@deploy2002: ladsgroup: Continuing with sync
  • 10:04 ladsgroup@deploy2002: ladsgroup: Backport for Set commonswiki pagelinks migration stage to READ NEW (T351237) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 10:02 ladsgroup@deploy2002: Started scap: Backport for Set commonswiki pagelinks migration stage to READ NEW (T351237)
  • 09:54 XioNoX: asw1-eqsin> request system reboot - T332395
  • 09:32 Emperor: reboot ms-be2074-80 before adding them to the rings T353149
  • 09:32 Emperor: reboot ms-be1072-82 before adding them to the rings T353149
  • 09:24 XioNoX: start install process on asw1-eqsin - T332395
  • 09:05 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 35 hosts with reason: eqsin switch upgrade
  • 09:04 ayounsi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on 35 hosts with reason: eqsin switch upgrade
  • 09:03 XioNoX: depool eqsin for switch upgrade - T332395
  • 08:27 xSavitar: UTC morning backport window done.
  • 08:26 derick@deploy2002: Finished scap: Backport for wmf-config: Remove unused wgStatsCacheType setting (T336004) (duration: 09m 11s)
  • 08:20 derick@deploy2002: derick and d3r1ck01: Continuing with sync
  • 08:18 derick@deploy2002: derick and d3r1ck01: Backport for wmf-config: Remove unused wgStatsCacheType setting (T336004) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:17 derick@deploy2002: Started scap: Backport for wmf-config: Remove unused wgStatsCacheType setting (T336004)

2024-01-06

  • 22:27 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 22:27 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 22:18 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.

2024-01-05

  • 23:49 thcipriani@deploy2002: Finished deploy [gerrit/gerrit@de3a994]: Removing survey banner gerrit:987995 (gerrit.wikimedia.org only this deploy) (duration: 00m 08s)
  • 23:49 thcipriani@deploy2002: Started deploy [gerrit/gerrit@de3a994]: Removing survey banner gerrit:987995 (gerrit.wikimedia.org only this deploy)
  • 23:31 thcipriani@deploy2002: Finished deploy [gerrit/gerrit@de3a994]: Removing survey banner gerrit:987995 (gerrit-replicas only this deploy) (duration: 00m 06s)
  • 23:31 thcipriani@deploy2002: Started deploy [gerrit/gerrit@de3a994]: Removing survey banner gerrit:987995 (gerrit-replicas only this deploy)
  • 23:25 thcipriani: deploying gerrit to remove survey banner https://gerrit.wikimedia.org/r/987995 (no downtime needed)
  • 19:29 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for restbase2034.codfw.wmnet
  • 19:29 eevans@cumin1002: START - Cookbook sre.hosts.remove-downtime for restbase2034.codfw.wmnet
  • 19:23 eevans@cumin1002: conftool action : set/weight=10; selector: cluster=restbase,dc=codfw,name=restbase2034.codfw.wmnet
  • 19:07 mutante: vrts1001 - sudo systemctl start clamav-daemon
  • 17:14 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:13 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:43 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 16:42 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 16:40 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 16:30 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 16:29 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 16:19 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 15:40 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:40 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:40 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:40 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:31 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 15:30 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 14:50 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 14:50 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 14:45 milimetric@deploy2002: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
  • 14:45 milimetric@deploy2002: helmfile [staging] START helmfile.d/services/edit-analytics: apply
  • 14:43 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:42 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:41 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:41 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:38 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:37 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:14 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 14:14 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 13:42 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 13:41 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 13:23 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 13:23 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 11:56 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mw1379.eqiad.wmnet
  • 11:49 kamila@cumin1002: START - Cookbook sre.hosts.reboot-single for host mw1379.eqiad.wmnet
  • 09:26 moritzm: installing 5.10.205 kernels on Bullseye hosts
  • 09:15 _joe_: upgrading conftool across the fleet
  • 08:01 moritzm: installing 6.1.69 kernels on Bookworm hosts
  • 01:27 zabe: zabe@mwmaint2002:~$ mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=arzwiki --logwiki=metawiki 'WanderingPlaywrite' 'WanderingPlaywright' # T354397
  • 00:59 cwhite: restarted prometheus@k8s on prometheus1006 and backed up the wal for OOM loop investigation
  • 00:52 cwhite: restarted prometheus@k8s on prometheus1005 and backed up the wal for OOM loop investigation

2024-01-04

  • 23:10 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:10 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:34 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:33 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:33 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:33 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:31 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:31 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:29 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:29 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:29 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:29 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:25 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:25 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:24 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 22:24 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 22:22 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 22:22 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 22:22 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 22:21 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 22:21 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 22:21 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 22:00 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 22:00 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 21:38 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:38 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:27 brennen: end of utc late backport window
  • 21:26 brennen@deploy2002: Finished scap: Backport for Ensure all non-okay statuses from ::getImageContents have a message (T354374) (duration: 08m 01s)
  • 21:20 brennen@deploy2002: brennen and dreamyjazz: Continuing with sync
  • 21:19 brennen@deploy2002: brennen and dreamyjazz: Backport for Ensure all non-okay statuses from ::getImageContents have a message (T354374) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:18 brennen@deploy2002: Started scap: Backport for Ensure all non-okay statuses from ::getImageContents have a message (T354374)
  • 21:17 brennen@deploy2002: Finished scap: Backport for Check for invalid JSON on a good response from PhotoDNA (T354370) (duration: 07m 57s)
  • 21:11 brennen@deploy2002: brennen and dreamyjazz: Continuing with sync
  • 21:10 brennen@deploy2002: brennen and dreamyjazz: Backport for Check for invalid JSON on a good response from PhotoDNA (T354370) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:09 brennen@deploy2002: Started scap: Backport for Check for invalid JSON on a good response from PhotoDNA (T354370)
  • 20:41 ryankemper: [apifeatureusage] T350703 Restarted `logstash` on `apifeatureusage[1,2]001`
  • 20:39 dduvall@deploy2002: rebuilt and synchronized wikiversions files: group2 wikis to 1.42.0-wmf.12 refs T350088
  • 20:30 mutante: mwmaint2002 - /usr/local/sbin/sync-home-mwmaint after gerrit:987778
  • 20:20 dduvall@deploy2002: Synchronized php: group1 wikis to 1.42.0-wmf.12 refs T350088 (duration: 06m 09s)
  • 20:16 ejegg: standalone (payments listener) SmashPig upgraded from fc74ccca to 20d6434e
  • 20:13 dduvall@deploy2002: rebuilt and synchronized wikiversions files: group1 wikis to 1.42.0-wmf.12 refs T350088
  • 20:03 mutante: releases2003 - systemctl status rsync-srv-org-wikimedia-releases-releases2003.codfw.wmnet after gerrit:987436
  • 20:01 mutante: releases2003 - systemctl start rsync-srv-patches-releases2003.codfw.wmnet after gerrit:987436
  • 19:59 brett: restarting pybal on lvs5006 for testing purposes - T353760
  • 19:59 mutante: releases1003 - systemctl start rsync-srv-patches-releases-primary after gerrit:987436
  • 19:57 dcausse: repooling wdqs1019
  • 19:52 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:51 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:49 dduvall@deploy2002: rebuilt and synchronized wikiversions files: group0 wikis to 1.42.0-wmf.12 refs T350088
  • 19:47 mutante: deploy1002 - systemctl start rsync-patches_module after gerrit:987436
  • 19:32 dduvall@deploy2002: Finished scap: Backport for Revise logic for creating compact links button on Vector 2022 (T353850) (duration: 07m 58s)
  • 19:26 dduvall@deploy2002: jdlrobson and dduvall: Continuing with sync
  • 19:26 dduvall@deploy2002: jdlrobson and dduvall: Backport for Revise logic for creating compact links button on Vector 2022 (T353850) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 19:25 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 19:25 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 19:24 dduvall@deploy2002: Started scap: Backport for Revise logic for creating compact links button on Vector 2022 (T353850)
  • 19:22 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 19:22 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 19:04 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 19:04 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 18:46 sukhe: [second time] mx2001: exiqgrep -i -r w*@gmail.com | xargs exim -Mrm
  • 18:03 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1377.eqiad.wmnet with OS bullseye
  • 17:57 sukhe: mx2001: exiqgrep -i -r w*@gmail.com | xargs exim -Mrm
  • 17:46 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1377.eqiad.wmnet with reason: host reimage
  • 17:43 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1377.eqiad.wmnet with reason: host reimage
  • 17:42 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 17:42 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 17:35 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 17:34 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 17:28 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1377.eqiad.wmnet with OS bullseye
  • 17:10 oblivian@puppetmaster2001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=kubernetes,service=kubesvc,name=mw1377.*
  • 16:43 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:42 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:42 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:41 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:41 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:41 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:36 volans@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts mw1378.eqiad.wmnet
  • 16:25 volans@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts mw1378.eqiad.wmnet
  • 16:00 volans@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts mw1378.eqiad.wmnet
  • 15:59 volans@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts mw1378.eqiad.wmnet
  • 15:58 moritzm: installing libdatetime-timezone-perl updates
  • 15:51 moritzm: rolling restart of FPM/apache on mw canaries to pick up curl updates
  • 15:48 XioNoX: repool esams - T346779
  • 15:46 volans@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts mw1378.eqiad.wmnet
  • 15:38 XioNoX: undrain esams-eqiad transport - T346779
  • 15:37 XioNoX: re-enable peering/transit on cr1-esams - T346779
  • 15:35 volans@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts mw1378.eqiad.wmnet
  • 15:30 XioNoX: reboot fpc0 on cr1-esams - T346779
  • 15:29 volans@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mw1378.mgmt.eqiad.wmnet with reboot policy GRACEFUL
  • 15:26 XioNoX: disable peering/transit on cr1-esams for linecard reboot - T346779
  • 15:19 volans: running sre.hosts.provision for mw1378 - T351074
  • 15:19 volans@cumin2002: START - Cookbook sre.hosts.provision for host mw1378.mgmt.eqiad.wmnet with reboot policy GRACEFUL
  • 15:16 XioNoX: drain esams-eqiad transport - T346779
  • 15:14 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:13 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:13 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:13 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:12 moritzm: installing curl security updates
  • 15:08 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:08 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:08 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:08 volans: rebooting mw1378 (downtimed and depooled) to debug reboot issues after reimage - T351074
  • 15:08 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:07 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:07 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:07 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:07 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:05 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:05 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:04 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:04 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:01 XioNoX: depool esams for router work - T346779
  • 15:00 tchanders@deploy2002: Finished scap: Backport for enable page_rerender for 2nd batch: dewiki, frwiktionary, and kuwiktionary (duration: 17m 55s)
  • 14:59 volans: rebooting mw1378 (downtimed and depooled) to debug reboot issues after reimage - T351074
  • 14:56 volans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on mw1378.eqiad.wmnet with reason: WIP hosts to be setup
  • 14:56 volans@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on mw1378.eqiad.wmnet with reason: WIP hosts to be setup
  • 14:54 tchanders@deploy2002: pfischer and tchanders: Continuing with sync
  • 14:45 tchanders@deploy2002: pfischer and tchanders: Backport for enable page_rerender for 2nd batch: dewiki, frwiktionary, and kuwiktionary synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:42 tchanders@deploy2002: Started scap: Backport for enable page_rerender for 2nd batch: dewiki, frwiktionary, and kuwiktionary
  • 14:40 tchanders@deploy2002: Finished scap: Backport for Attempt to send original file to PhotoDNA if no thumbnail (T353854) (duration: 09m 25s)
  • 14:34 tchanders@deploy2002: tchanders and dreamyjazz: Continuing with sync
  • 14:34 tchanders@deploy2002: tchanders and dreamyjazz: Backport for Attempt to send original file to PhotoDNA if no thumbnail (T353854) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:30 tchanders@deploy2002: Started scap: Backport for Attempt to send original file to PhotoDNA if no thumbnail (T353854)
  • 14:25 tchanders@deploy2002: Finished scap: Backport for Attempt to send original file to PhotoDNA if no thumbnail (T353854) (duration: 09m 24s)
  • 14:20 tchanders@deploy2002: dreamyjazz and tchanders: Continuing with sync
  • 14:20 tchanders@deploy2002: dreamyjazz and tchanders: Backport for Attempt to send original file to PhotoDNA if no thumbnail (T353854) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:16 tchanders@deploy2002: Started scap: Backport for Attempt to send original file to PhotoDNA if no thumbnail (T353854)
  • 14:12 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:12 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:09 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:09 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:09 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:08 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:08 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:08 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:06 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:03 XioNoX: repool drmrs - T354340
  • 14:01 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:00 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:00 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:57 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 2686
  • 13:56 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 2686
  • 13:53 moritzm: installing libssh security updates
  • 13:24 dcausse: restarting blazegraph on wdqs1019 (stuck with high thread count)
  • 13:07 zabe@deploy2002: Finished scap: Backport for Revert "Get blocks from DatabaseBlockStore instead of doing our own query" (T353620), Revert "Support new block schema" (T354298) (duration: 10m 06s)
  • 13:02 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host mw1377.eqiad.wmnet
  • 13:02 XioNoX: depool drmrs for router work - T354340
  • 13:01 zabe@deploy2002: zabe: Continuing with sync
  • 13:00 zabe@deploy2002: zabe: Backport for Revert "Get blocks from DatabaseBlockStore instead of doing our own query" (T353620), Revert "Support new block schema" (T354298) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 12:56 zabe@deploy2002: Started scap: Backport for Revert "Get blocks from DatabaseBlockStore instead of doing our own query" (T353620), Revert "Support new block schema" (T354298)
  • 12:53 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 63296
  • 12:52 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 63296
  • 12:10 kamila@cumin1002: START - Cookbook sre.hosts.reboot-single for host mw1377.eqiad.wmnet
  • 12:04 moritzm: installing lua5.3 security updates
  • 11:52 moritzm: installing libde265 security updates
  • 11:35 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1379.eqiad.wmnet with OS bullseye
  • 11:19 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1379.eqiad.wmnet with reason: host reimage
  • 11:16 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1379.eqiad.wmnet with reason: host reimage
  • 11:01 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1379.eqiad.wmnet with OS bullseye
  • 10:51 akosiaris@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 10:33 akosiaris@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 10:32 akosiaris@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 10:17 akosiaris: bump memory limits for calico-node in wikikube codfw/eqiad by 25% (i.e. from 400Mi to 500Mi) take #3
  • 10:17 akosiaris@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 09:57 akosiaris@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 09:38 akosiaris: delete mw1377-mw1383 from eqiad wikikube nodes
  • 09:38 akosiaris: bump memory limits for calico-node in wikikube codfw/eqiad by 25% (i.e. from 400Mi to 500Mi) take #2
  • 09:36 akosiaris@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 09:36 akosiaris@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 09:22 akosiaris: bump memory limits for calico-node in wikikube codfw/eqiad by 25% (i.e. from 400Mi to 500Mi)
  • 09:22 akosiaris@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 09:13 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:13 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:13 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:12 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:11 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:09 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:49 ladsgroup@deploy2002: Finished scap: Backport for Update virtual domain for url shortener (duration: 12m 35s)
  • 08:43 ladsgroup@deploy2002: ladsgroup: Continuing with sync
  • 08:38 ladsgroup@deploy2002: ladsgroup: Backport for Update virtual domain for url shortener synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:36 ladsgroup@deploy2002: Started scap: Backport for Update virtual domain for url shortener
  • 08:34 ladsgroup@deploy2002: Finished scap: Backport for Add virtual domain config for reading lists extension (T353948) (duration: 09m 05s)
  • 08:28 ladsgroup@deploy2002: ladsgroup: Continuing with sync
  • 08:27 ladsgroup@deploy2002: ladsgroup: Backport for Add virtual domain config for reading lists extension (T353948) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:25 ladsgroup@deploy2002: Started scap: Backport for Add virtual domain config for reading lists extension (T353948)
  • 07:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1151.eqiad.wmnet with OS bookworm
  • 06:42 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1151.eqiad.wmnet with reason: host reimage
  • 06:40 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1151.eqiad.wmnet with reason: host reimage
  • 06:28 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1151.eqiad.wmnet with OS bookworm
  • 03:49 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.

2024-01-03

  • 23:50 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on mw1379.eqiad.wmnet with reason: failed reimage, will fix tomorrow
  • 23:50 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on mw1379.eqiad.wmnet with reason: failed reimage, will fix tomorrow
  • 23:50 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on mw1379.eqiad.wmnet with reason: failed reimage, will fix tomorrow
  • 23:50 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on mw1379.eqiad.wmnet with reason: failed reimage, will fix tomorrow
  • 23:33 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.restart (exit_code=0)
  • 23:24 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 23:24 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 23:18 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1383.eqiad.wmnet with OS bullseye
  • 23:15 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1380.eqiad.wmnet with OS bullseye
  • 23:14 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1382.eqiad.wmnet with OS bullseye
  • 23:12 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1378.eqiad.wmnet with OS bullseye
  • 23:10 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1381.eqiad.wmnet with OS bullseye
  • 23:07 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw1379.eqiad.wmnet with OS bullseye
  • 23:02 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1383.eqiad.wmnet with reason: host reimage
  • 23:01 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.restart (exit_code=0)
  • 22:59 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw1380.eqiad.wmnet with reason: host reimage
  • 22:59 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1379.eqiad.wmnet with reason: host reimage
  • 22:57 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1382.eqiad.wmnet with reason: host reimage
  • 22:54 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw1381.eqiad.wmnet with reason: host reimage
  • 22:54 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1378.eqiad.wmnet with reason: host reimage
  • 22:52 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1383.eqiad.wmnet with reason: host reimage
  • 22:52 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1382.eqiad.wmnet with reason: host reimage
  • 22:52 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1380.eqiad.wmnet with reason: host reimage
  • 22:52 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1381.eqiad.wmnet with reason: host reimage
  • 22:51 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1379.eqiad.wmnet with reason: host reimage
  • 22:51 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1378.eqiad.wmnet with reason: host reimage
  • 22:40 bking@cumin2002: START - Cookbook sre.wdqs.restart
  • 22:38 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1383.eqiad.wmnet with OS bullseye
  • 22:38 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1382.eqiad.wmnet with OS bullseye
  • 22:37 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1381.eqiad.wmnet with OS bullseye
  • 22:37 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1380.eqiad.wmnet with OS bullseye
  • 22:37 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1379.eqiad.wmnet with OS bullseye
  • 22:36 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1378.eqiad.wmnet with OS bullseye
  • 22:36 bking@cumin2002: START - Cookbook sre.wdqs.restart
  • 22:20 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2087.codfw.wmnet with OS bullseye
  • 22:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2087.codfw.wmnet with reason: host reimage
  • 21:59 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2087.codfw.wmnet with reason: host reimage
  • 21:52 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1377.eqiad.wmnet with OS bullseye
  • 21:48 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 6 hosts with reason: broken reimage
  • 21:47 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on 6 hosts with reason: broken reimage
  • 21:43 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2087.codfw.wmnet with OS bullseye
  • 21:36 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1377.eqiad.wmnet with reason: host reimage
  • 21:34 zabe@deploy2002: Finished scap: Backport for Update mediawiki/mediawiki-codesniffer to 42.0.0 (duration: 10m 34s)
  • 21:33 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1377.eqiad.wmnet with reason: host reimage
  • 21:28 zabe@deploy2002: zabe: Continuing with sync
  • 21:27 zabe@deploy2002: zabe: Backport for Update mediawiki/mediawiki-codesniffer to 42.0.0 synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:24 zabe@deploy2002: Started scap: Backport for Update mediawiki/mediawiki-codesniffer to 42.0.0
  • 21:19 TheresNoTime: UTC late backport window done
  • 21:18 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1377.eqiad.wmnet with OS bullseye
  • 21:14 samtar@deploy2002: Finished scap: Backport for Add "patroller" user group to testwiki (T354063) (duration: 12m 19s)
  • 21:08 samtar@deploy2002: novemlinguae and samtar: Continuing with sync
  • 21:06 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw1383.eqiad.wmnet with OS bullseye
  • 21:06 samtar@deploy2002: novemlinguae and samtar: Backport for Add "patroller" user group to testwiki (T354063) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:04 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw1382.eqiad.wmnet with OS bullseye
  • 21:02 samtar@deploy2002: Started scap: Backport for Add "patroller" user group to testwiki (T354063)
  • 20:59 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw1381.eqiad.wmnet with OS bullseye
  • 20:47 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw1380.eqiad.wmnet with OS bullseye
  • 20:45 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw1379.eqiad.wmnet with OS bullseye
  • 20:37 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw1378.eqiad.wmnet with OS bullseye
  • 20:34 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw1377.eqiad.wmnet with OS bullseye
  • 20:17 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2450.codfw.wmnet with OS bullseye
  • 20:15 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2443.codfw.wmnet with OS bullseye
  • 20:11 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2451.codfw.wmnet with OS bullseye
  • 20:04 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2442.codfw.wmnet with OS bullseye
  • 20:00 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1383.eqiad.wmnet with reason: host reimage
  • 19:57 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2436.codfw.wmnet with OS bullseye
  • 19:57 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1382.eqiad.wmnet with reason: host reimage
  • 19:57 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw2450.codfw.wmnet with reason: host reimage
  • 19:55 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2440.codfw.wmnet with OS bullseye
  • 19:55 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2443.codfw.wmnet with reason: host reimage
  • 19:53 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2437.codfw.wmnet with OS bullseye
  • 19:52 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1381.eqiad.wmnet with reason: host reimage
  • 19:51 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw2451.codfw.wmnet with reason: host reimage
  • 19:51 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2451.codfw.wmnet with reason: host reimage
  • 19:51 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2450.codfw.wmnet with reason: host reimage
  • 19:50 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2443.codfw.wmnet with reason: host reimage
  • 19:50 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1383.eqiad.wmnet with reason: host reimage
  • 19:49 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1382.eqiad.wmnet with reason: host reimage
  • 19:49 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1381.eqiad.wmnet with reason: host reimage
  • 19:44 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2442.codfw.wmnet with reason: host reimage
  • 19:42 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1380.eqiad.wmnet with reason: host reimage
  • 19:39 mutante: root@doc2002: /usr/local/sbin/sync-doc-host-data-sync after gerrit:987406
  • 19:39 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1379.eqiad.wmnet with reason: host reimage
  • 19:38 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2442.codfw.wmnet with reason: host reimage
  • 19:36 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1380.eqiad.wmnet with reason: host reimage
  • 19:36 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw2440.codfw.wmnet with reason: host reimage
  • 19:36 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2436.codfw.wmnet with reason: host reimage
  • 19:35 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1383.eqiad.wmnet with OS bullseye
  • 19:35 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2440.codfw.wmnet with reason: host reimage
  • 19:35 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1379.eqiad.wmnet with reason: host reimage
  • 19:35 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1382.eqiad.wmnet with OS bullseye
  • 19:34 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1381.eqiad.wmnet with OS bullseye
  • 19:33 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2451.codfw.wmnet with OS bullseye
  • 19:33 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2437.codfw.wmnet with reason: host reimage
  • 19:33 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2450.codfw.wmnet with OS bullseye
  • 19:32 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2443.codfw.wmnet with OS bullseye
  • 19:31 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1378.eqiad.wmnet with reason: host reimage
  • 19:28 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2437.codfw.wmnet with reason: host reimage
  • 19:28 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1377.eqiad.wmnet with reason: host reimage
  • 19:26 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2436.codfw.wmnet with reason: host reimage
  • 19:26 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1378.eqiad.wmnet with reason: host reimage
  • 19:25 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1377.eqiad.wmnet with reason: host reimage
  • 19:22 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1380.eqiad.wmnet with OS bullseye
  • 19:21 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1379.eqiad.wmnet with OS bullseye
  • 19:19 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2442.codfw.wmnet with OS bullseye
  • 19:18 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2440.codfw.wmnet with OS bullseye
  • 19:11 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1378.eqiad.wmnet with OS bullseye
  • 19:11 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1377.eqiad.wmnet with OS bullseye
  • 19:10 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2437.codfw.wmnet with OS bullseye
  • 19:08 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2436.codfw.wmnet with OS bullseye
  • 18:27 brennen@deploy2002: Finished deploy [phabricator/deployment@369e797]: deploy to phab2002 for T334519 (duration: 00m 27s)
  • 18:27 brennen@deploy2002: Started deploy [phabricator/deployment@369e797]: deploy to phab2002 for T334519
  • 18:27 brennen: running an essentially no-op phab2002 deploy
  • 18:11 dduvall@deploy2002: Finished scap: testwikis wikis to 1.42.0-wmf.12 refs T350088 (duration: 07m 23s)
  • 18:03 dduvall@deploy2002: Started scap: testwikis wikis to 1.42.0-wmf.12 refs T350088
  • 17:06 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo and not P{cp4044.ulsfo.wmnet} and A:cp
  • 16:45 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo and not P{cp4044.ulsfo.wmnet} and A:cp
  • 16:33 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo and not P{cp4050.ulsfo.wmnet,cp4051.ulsfo.wmnet} and A:cp
  • 16:27 stevemunene@deploy2002: helmfile [eqiad] DONE helmfile.d/services/editor-analytics: apply
  • 16:27 stevemunene@deploy2002: helmfile [eqiad] START helmfile.d/services/editor-analytics: apply
  • 16:27 stevemunene@deploy2002: helmfile [codfw] DONE helmfile.d/services/editor-analytics: apply
  • 16:26 stevemunene@deploy2002: helmfile [codfw] START helmfile.d/services/editor-analytics: apply
  • 16:26 stevemunene@deploy2002: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
  • 16:26 stevemunene@deploy2002: helmfile [staging] START helmfile.d/services/editor-analytics: apply
  • 16:25 stevemunene@deploy2002: helmfile [eqiad] DONE helmfile.d/services/edit-analytics: apply
  • 16:25 stevemunene@deploy2002: helmfile [eqiad] START helmfile.d/services/edit-analytics: apply
  • 16:24 stevemunene@deploy2002: helmfile [codfw] DONE helmfile.d/services/edit-analytics: apply
  • 16:24 stevemunene@deploy2002: helmfile [codfw] START helmfile.d/services/edit-analytics: apply
  • 16:23 stevemunene@deploy2002: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
  • 16:22 stevemunene@deploy2002: helmfile [staging] START helmfile.d/services/edit-analytics: apply
  • 16:16 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo and not P{cp4050.ulsfo.wmnet,cp4051.ulsfo.wmnet} and A:cp
  • 16:11 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on P{cp3066.esams.wmnet} and A:cp
  • 16:10 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on P{cp3066.esams.wmnet} and A:cp
  • 15:39 moritzm: rebuild md RAIDs after disk swap T353324
  • 14:55 TheresNoTime: UTC afternoon backport window done
  • 14:54 samtar@deploy2002: Finished scap: Backport for zhwikinews: update wordmark (T353792) (duration: 09m 11s)
  • 14:48 samtar@deploy2002: anzx and samtar: Continuing with sync
  • 14:46 samtar@deploy2002: anzx and samtar: Backport for zhwikinews: update wordmark (T353792) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:45 samtar@deploy2002: Started scap: Backport for zhwikinews: update wordmark (T353792)
  • 14:43 samtar@deploy2002: Finished scap: Backport for aswikiquote: change wordmark and update logo (T353934) (duration: 07m 51s)
  • 14:38 samtar@deploy2002: samtar and anzx: Continuing with sync
  • 14:37 samtar@deploy2002: samtar and anzx: Backport for aswikiquote: change wordmark and update logo (T353934) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:36 samtar@deploy2002: Started scap: Backport for aswikiquote: change wordmark and update logo (T353934)
  • 14:34 samtar@deploy2002: Finished scap: Backport for Edit Recovery: fix typo in expiry field name (T347673) (duration: 07m 46s)
  • 14:29 samtar@deploy2002: samtar: Continuing with sync
  • 14:28 samtar@deploy2002: samtar: Backport for Edit Recovery: fix typo in expiry field name (T347673) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:27 samtar@deploy2002: Started scap: Backport for Edit Recovery: fix typo in expiry field name (T347673)
  • 14:18 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:18 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:17 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:17 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:11 samtar@deploy2002: Finished scap: Backport for zhwikivoyage: Enable block feature for abusefilter (T353604), ganwiki: Add transwiki import sources (T354000) (duration: 09m 58s)
  • 14:06 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:06 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:06 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:05 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:05 samtar@deploy2002: samtar and stang: Continuing with sync
  • 14:03 moritzm: installing qemu security updates
  • 14:02 samtar@deploy2002: samtar and stang: Backport for zhwikivoyage: Enable block feature for abusefilter (T353604), ganwiki: Add transwiki import sources (T354000) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:01 samtar@deploy2002: Started scap: Backport for zhwikivoyage: Enable block feature for abusefilter (T353604), ganwiki: Add transwiki import sources (T354000)
  • 13:32 root@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Nick Ifeajika out of all services on: 2220 hosts
  • 13:31 root@cumin2002: START - Cookbook sre.idm.logout Logging Nick Ifeajika out of all services on: 2220 hosts
  • 13:29 moritzm: installing Java 8/11 security updates
  • 12:34 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:34 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:29 jmm@cumin2002: END (PASS) - Cookbook sre.maps.roll-restart-reboot-master (exit_code=0) rolling restart_daemons on A:maps-master
  • 12:28 jmm@cumin2002: START - Cookbook sre.maps.roll-restart-reboot-master rolling restart_daemons on A:maps-master
  • 12:23 jmm@cumin2002: END (PASS) - Cookbook sre.maps.roll-restart-reboot (exit_code=0) rolling restart_daemons on A:maps-replica-eqiad
  • 12:18 jmm@cumin2002: START - Cookbook sre.maps.roll-restart-reboot rolling restart_daemons on A:maps-replica-eqiad
  • 12:14 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:14 jmm@cumin2002: END (PASS) - Cookbook sre.maps.roll-restart-reboot (exit_code=0) rolling restart_daemons on A:maps-replica-codfw
  • 12:13 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:08 jmm@cumin2002: START - Cookbook sre.maps.roll-restart-reboot rolling restart_daemons on A:maps-replica-codfw
  • 12:02 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:02 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:01 moritzm: installing gnutls28 security updates on buster
  • 11:47 oblivian@deploy2002: Finished scap: Backport for Fix timeouts detection on mw on k8s jobrunners (T354229) (duration: 11m 38s)
  • 11:44 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:44 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:41 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:41 oblivian@deploy2002: oblivian: Continuing with sync
  • 11:40 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:39 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:37 oblivian@deploy2002: oblivian: Backport for Fix timeouts detection on mw on k8s jobrunners (T354229) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 11:36 oblivian@deploy2002: Started scap: Backport for Fix timeouts detection on mw on k8s jobrunners (T354229)
  • 11:31 oblivian@deploy2002: Finished scap: Backport for Disable things that don't work on k8s when on k8s (duration: 15m 29s)
  • 11:25 oblivian@deploy2002: oblivian: Continuing with sync
  • 11:25 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:24 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:24 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:24 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:24 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:23 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:23 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:23 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:23 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:22 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:22 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:21 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:21 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:20 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:20 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:18 oblivian@deploy2002: oblivian: Backport for Disable things that don't work on k8s when on k8s synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 11:16 oblivian@deploy2002: Started scap: Backport for Disable things that don't work on k8s when on k8s
  • 11:05 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:56 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:53 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:51 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:51 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:48 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:48 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:46 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:46 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:35 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:35 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:23 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:16 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:15 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:13 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:11 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:11 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:10 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:09 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:08 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:57 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:40 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:39 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:36 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:36 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:35 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:35 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:33 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:33 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:32 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:32 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:31 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:31 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:21 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:21 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:21 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:21 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:21 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:13 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:10 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:10 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:08 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:07 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:03 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:16 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 01:16 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 01:13 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 01:13 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 00:55 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 00:55 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 00:08 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 00:08 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.

2024-01-02

  • 22:42 urbanecm: mwmaint2002: Restart `mwscript extensions/GrowthExperiments/maintenance/reassignMentees.php --wiki=enwiki --mentor 'FormalDude' --performer 'Martin Urbanec (WMF)'` (T354220)
  • 22:29 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2087.codfw.wmnet with OS bullseye
  • 21:08 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2087.codfw.wmnet with OS bullseye
  • 20:52 urbanecm: mwmaint2002: `mwscript extensions/GrowthExperiments/maintenance/reassignMentees.php --wiki=enwiki --mentor 'FormalDude' --performer 'Martin Urbanec (WMF)'` (T354220)
  • 20:32 mutante: phab2002 - synced /srv/homes from phab1004 to /srv/homes on phab2002
  • 19:39 dduvall@deploy2002: rebuilt and synchronized wikiversions files: group0 wikis to 1.42.0-wmf.12 refs T350088
  • 18:29 mutante: confctl select 'name=mw2394.codfw.wmnet' set/pooled=inactive | T354193#9430654 - seems like 2396 was previously depooled instead of this 2394
  • 17:29 dancy@deploy2002: Installation of scap version "4.65.1" completed for 566 hosts
  • 17:28 dancy@deploy2002: Installing scap version "4.65.1" for 566 hosts
  • 17:26 dancy@deploy2002: Installing scap version "4.65.1" for 567 hosts
  • 14:59 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dbstore1008.eqiad.wmnet with OS bookworm
  • 14:58 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dbstore1009.eqiad.wmnet with OS bookworm
  • 14:44 urbanecm: [urbanecm@mwmaint2002 ~]$ mwscript namespaceDupes.php --wiki=csbwiktionary --fix # T354114
  • 14:43 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dbstore1009.eqiad.wmnet with reason: host reimage
  • 14:40 btullis@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on dbstore1009.eqiad.wmnet with reason: host reimage
  • 14:37 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dbstore1008.eqiad.wmnet with reason: host reimage
  • 14:34 btullis@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on dbstore1008.eqiad.wmnet with reason: host reimage
  • 14:32 _joe_: confctl select 'name=mw2396.codfw.wmnet' set/pooled=inactive
  • 14:26 btullis@cumin1001: START - Cookbook sre.hosts.reimage for host dbstore1009.eqiad.wmnet with OS bookworm
  • 14:20 btullis@cumin1001: START - Cookbook sre.hosts.reimage for host dbstore1008.eqiad.wmnet with OS bookworm
  • 14:16 urbanecm@deploy2002: Finished scap: Backport for cswiki: Grant patrolmarks to autopatrolled (T354004), csbwiktionary: Set MetaNamespaceName to Wikisłowôrz (T354114) (duration: 13m 46s)
  • 14:04 urbanecm@deploy2002: urbanecm: Continuing with sync
  • 14:04 urbanecm@deploy2002: urbanecm: Backport for cswiki: Grant patrolmarks to autopatrolled (T354004), csbwiktionary: Set MetaNamespaceName to Wikisłowôrz (T354114) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:02 urbanecm@deploy2002: Started scap: Backport for cswiki: Grant patrolmarks to autopatrolled (T354004), csbwiktionary: Set MetaNamespaceName to Wikisłowôrz (T354114)
  • 10:55 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on P{cp4044.ulsfo.wmnet,cp4050.ulsfo.wmnet} and A:cp
  • 10:50 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on P{cp4044.ulsfo.wmnet,cp4050.ulsfo.wmnet} and A:cp
  • 10:38 vgutierrez: fetching haproxy 2.6.16 for thirdparty/haproxy26 bullseye-wikimedia (apt.wm.o)
  • 09:23 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on dbstore1008.eqiad.wmnet with reason: Commissioning new database server
  • 09:23 btullis@cumin1001: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on dbstore1008.eqiad.wmnet with reason: Commissioning new database server
  • 09:17 pfischer@deploy2002: Finished scap: Backport for configure message_key_fields for update_pipeline (duration: 15m 35s)
  • 09:05 pfischer@deploy2002: pfischer: Continuing with sync
  • 09:04 pfischer@deploy2002: pfischer: Backport for configure message_key_fields for update_pipeline synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 09:02 moritzm: installing nodejs security updates on bookworm
  • 09:02 pfischer@deploy2002: Started scap: Backport for configure message_key_fields for update_pipeline
  • 08:33 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mw2448.mgmt.codfw.wmnet with reboot policy GRACEFUL
  • 08:27 jayme: restart prometheus@k8s prometheus@k8s-aux in eqiad - T343529
  • 08:26 akosiaris@cumin1001: START - Cookbook sre.hosts.provision for host mw2448.mgmt.codfw.wmnet with reboot policy GRACEFUL
  • 06:45 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2144.codfw.wmnet with OS bookworm
  • 06:27 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2144.codfw.wmnet with reason: host reimage
  • 06:24 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2144.codfw.wmnet with reason: host reimage
  • 06:06 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2144.codfw.wmnet with OS bookworm
  • 05:00 mwpresync@deploy2002: Finished scap: testwikis wikis to 1.42.0-wmf.12 refs T350088 (duration: 56m 48s)
  • 04:03 mwpresync@deploy2002: Started scap: testwikis wikis to 1.42.0-wmf.12 refs T350088

2024-01-01

  • 21:38 eileen: config revision changed from 026cf508 to 21b91455
  • 21:13 eileen: config revision changed from 3a1a1444 to 026cf508
  • 21:13 eileen: fork/mapping-edit-button-fix
  • 17:11 joal@deploy2002: Finished deploy [airflow-dags/analytics@8b8a456]: Fix monthly job [airflow-dags/analytics@8b8a4567] (duration: 00m 31s)
  • 17:11 joal@deploy2002: Started deploy [airflow-dags/analytics@8b8a456]: Fix monthly job [airflow-dags/analytics@8b8a4567]

