Server Admin Log/Archive 75

2024-01-31

23:11 eileen: civicrm upgraded from 6344c95e to 6e1e0d21
22:29 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
22:29 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
22:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T355609)', diff saved to https://phabricator.wikimedia.org/P56010 and previous config saved to /var/cache/conftool/dbconfig/20240131-222853-marostegui.json
22:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P56009 and previous config saved to /var/cache/conftool/dbconfig/20240131-221347-marostegui.json
22:11 jdrewniak@deploy2002: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 06m 43s)
22:05 jdrewniak@deploy2002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 07m 26s)
21:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P56008 and previous config saved to /var/cache/conftool/dbconfig/20240131-215840-marostegui.json
21:54 Dreamy_Jazz: Removed already applied patches for T347708 from /srv/patches
21:48 dancy@deploy2002: Synchronized php: group1 wikis to 1.42.0-wmf.16 refs T354434 (duration: 06m 47s)
21:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T355609)', diff saved to https://phabricator.wikimedia.org/P56007 and previous config saved to /var/cache/conftool/dbconfig/20240131-214334-marostegui.json
21:42 dancy@deploy2002: rebuilt and synchronized wikiversions files: group1 wikis to 1.42.0-wmf.16 refs T354434
21:35 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1233 (T355609)', diff saved to https://phabricator.wikimedia.org/P56006 and previous config saved to /var/cache/conftool/dbconfig/20240131-213454-marostegui.json
21:35 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1233.eqiad.wmnet with reason: Maintenance
21:34 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1233.eqiad.wmnet with reason: Maintenance
21:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T355609)', diff saved to https://phabricator.wikimedia.org/P56005 and previous config saved to /var/cache/conftool/dbconfig/20240131-213432-marostegui.json
21:31 Dreamy_Jazz: Security deploy done
21:30 logmsgbot: dreamyjazz Deployed security patch for T356226
21:23 logmsgbot: dreamyjazz Deployed security patch for T356226
21:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P56004 and previous config saved to /var/cache/conftool/dbconfig/20240131-211926-marostegui.json
21:16 Dreamy_Jazz: Doing security deploy for T356226
21:12 jforrester@deploy2002: Finished scap: Backport for Gadget: Bump GADGET_CLASS_VERSION (T356322) (duration: 08m 31s)
21:05 jforrester@deploy2002: jforrester and reedy: Continuing with sync
21:05 jforrester@deploy2002: jforrester and reedy: Backport for Gadget: Bump GADGET_CLASS_VERSION (T356322) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
21:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P56003 and previous config saved to /var/cache/conftool/dbconfig/20240131-210419-marostegui.json
21:03 jforrester@deploy2002: Started scap: Backport for Gadget: Bump GADGET_CLASS_VERSION (T356322)
20:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T355609)', diff saved to https://phabricator.wikimedia.org/P56002 and previous config saved to /var/cache/conftool/dbconfig/20240131-204913-marostegui.json
20:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1229 (T355609)', diff saved to https://phabricator.wikimedia.org/P56001 and previous config saved to /var/cache/conftool/dbconfig/20240131-204439-marostegui.json
20:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1229.eqiad.wmnet with reason: Maintenance
20:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1229.eqiad.wmnet with reason: Maintenance
20:37 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1225.eqiad.wmnet with reason: Maintenance
20:37 eevans@deploy2002: helmfile [eqiad] DONE helmfile.d/services/sessionstore: apply
20:37 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1225.eqiad.wmnet with reason: Maintenance
20:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1222 (T355609)', diff saved to https://phabricator.wikimedia.org/P56000 and previous config saved to /var/cache/conftool/dbconfig/20240131-203704-marostegui.json
20:36 eevans@deploy2002: helmfile [eqiad] START helmfile.d/services/sessionstore: apply
20:36 eevans@deploy2002: helmfile [staging] DONE helmfile.d/services/sessionstore: sync
20:35 eevans@deploy2002: helmfile [staging] START helmfile.d/services/sessionstore: sync
20:35 eevans@deploy2002: helmfile [codfw] DONE helmfile.d/services/sessionstore: sync
20:35 eevans@deploy2002: helmfile [codfw] START helmfile.d/services/sessionstore: sync
20:33 urbanecm: [urbanecm@mwmaint2002 ~]$ mwscript userOptions.php --wiki=testwiki --old-is-default --old=2 --new 1 --nowarn 'echo-subscriptions-web-reverted' # T353225
20:32 eevans@deploy2002: helmfile [codfw] DONE helmfile.d/services/sessionstore: apply
20:31 eevans@deploy2002: helmfile [codfw] START helmfile.d/services/sessionstore: apply
20:28 joal@deploy2002: Finished deploy [analytics/refinery@b738b3f] (hadoop-test): HOTFIX analytics weekly train - Test [analytics/refinery@b738b3fd] (duration: 03m 35s)
20:28 eevans@deploy2002: helmfile [staging] DONE helmfile.d/services/sessionstore: apply
20:27 eevans@deploy2002: helmfile [staging] START helmfile.d/services/sessionstore: apply
20:25 joal@deploy2002: Started deploy [analytics/refinery@b738b3f] (hadoop-test): HOTFIX analytics weekly train - Test [analytics/refinery@b738b3fd]
20:24 joal@deploy2002: Finished deploy [analytics/refinery@b738b3f] (thin): HOTFIX analytics weekly train -THIN [analytics/refinery@b738b3fd] (duration: 00m 05s)
20:24 joal@deploy2002: Started deploy [analytics/refinery@b738b3f] (thin): HOTFIX analytics weekly train -THIN [analytics/refinery@b738b3fd]
20:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P55999 and previous config saved to /var/cache/conftool/dbconfig/20240131-202158-marostegui.json
20:10 joal@deploy2002: Finished deploy [analytics/refinery@b738b3f]: HOTFIX analytics weekly train [analytics/refinery@b738b3fd] (duration: 10m 51s)
20:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P55998 and previous config saved to /var/cache/conftool/dbconfig/20240131-200652-marostegui.json
19:59 joal@deploy2002: Started deploy [analytics/refinery@b738b3f]: HOTFIX analytics weekly train [analytics/refinery@b738b3fd]
19:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1222 (T355609)', diff saved to https://phabricator.wikimedia.org/P55997 and previous config saved to /var/cache/conftool/dbconfig/20240131-195145-marostegui.json
19:39 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1222 (T355609)', diff saved to https://phabricator.wikimedia.org/P55996 and previous config saved to /var/cache/conftool/dbconfig/20240131-193927-marostegui.json
19:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1222.eqiad.wmnet with reason: Maintenance
19:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1222.eqiad.wmnet with reason: Maintenance
19:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T355609)', diff saved to https://phabricator.wikimedia.org/P55994 and previous config saved to /var/cache/conftool/dbconfig/20240131-193905-marostegui.json
19:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P55993 and previous config saved to /var/cache/conftool/dbconfig/20240131-192359-marostegui.json
19:17 dancy@deploy2002: rebuilt and synchronized wikiversions files: group0 wikis to 1.42.0-wmf.16 refs T354434
19:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P55992 and previous config saved to /var/cache/conftool/dbconfig/20240131-190852-marostegui.json
18:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T355609)', diff saved to https://phabricator.wikimedia.org/P55991 and previous config saved to /var/cache/conftool/dbconfig/20240131-185345-marostegui.json
18:49 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1197 (T355609)', diff saved to https://phabricator.wikimedia.org/P55990 and previous config saved to /var/cache/conftool/dbconfig/20240131-184900-marostegui.json
18:48 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1197.eqiad.wmnet with reason: Maintenance
18:48 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1197.eqiad.wmnet with reason: Maintenance
18:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T355609)', diff saved to https://phabricator.wikimedia.org/P55989 and previous config saved to /var/cache/conftool/dbconfig/20240131-184838-marostegui.json
18:40 phuedx@deploy2002: Finished deploy [airflow-dags/analytics@5078a6b]: (no justification provided) (duration: 00m 28s)
18:40 phuedx@deploy2002: Started deploy [airflow-dags/analytics@5078a6b]: (no justification provided)
18:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P55988 and previous config saved to /var/cache/conftool/dbconfig/20240131-183332-marostegui.json
18:18 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P55986 and previous config saved to /var/cache/conftool/dbconfig/20240131-181825-marostegui.json
18:04 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cloudelastic1010.eqiad.wmnet with reason: T355617
18:04 bking@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on cloudelastic1010.eqiad.wmnet with reason: T355617
18:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T355609)', diff saved to https://phabricator.wikimedia.org/P55985 and previous config saved to /var/cache/conftool/dbconfig/20240131-180319-marostegui.json
17:58 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1188 (T355609)', diff saved to https://phabricator.wikimedia.org/P55984 and previous config saved to /var/cache/conftool/dbconfig/20240131-175833-marostegui.json
17:58 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1188.eqiad.wmnet with reason: Maintenance
17:58 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1188.eqiad.wmnet with reason: Maintenance
17:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T355609)', diff saved to https://phabricator.wikimedia.org/P55983 and previous config saved to /var/cache/conftool/dbconfig/20240131-175811-marostegui.json
17:51 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2005.codfw.wmnet with OS bookworm
17:50 aokoth@cumin1002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM vrts1001.eqiad.wmnet
17:46 aokoth@cumin1002: START - Cookbook sre.ganeti.reboot-vm for VM vrts1001.eqiad.wmnet
17:45 aokoth@cumin1002: END (FAIL) - Cookbook sre.ganeti.reboot-vm (exit_code=99) for VM vrts1001.eqiad.wmnet
17:45 aokoth@cumin1002: START - Cookbook sre.ganeti.reboot-vm for VM vrts1001.eqiad.wmnet
17:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P55982 and previous config saved to /var/cache/conftool/dbconfig/20240131-174305-marostegui.json
17:35 phuedx@deploy2002: Finished deploy [analytics/refinery@bef134c] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@bef134c2] (duration: 03m 29s)
17:31 phuedx@deploy2002: Started deploy [analytics/refinery@bef134c] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@bef134c2]
17:31 phuedx@deploy2002: Finished deploy [analytics/refinery@bef134c] (thin): Regular analytics weekly train THIN [analytics/refinery@bef134c2] (duration: 00m 08s)
17:30 phuedx@deploy2002: Started deploy [analytics/refinery@bef134c] (thin): Regular analytics weekly train THIN [analytics/refinery@bef134c2]
17:30 phuedx@deploy2002: Finished deploy [analytics/refinery@bef134c]: Regular analytics weekly train [analytics/refinery@bef134c2] (duration: 11m 05s)
17:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P55981 and previous config saved to /var/cache/conftool/dbconfig/20240131-172758-marostegui.json
17:19 phuedx@deploy2002: Started deploy [analytics/refinery@bef134c]: Regular analytics weekly train [analytics/refinery@bef134c2]
17:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T355609)', diff saved to https://phabricator.wikimedia.org/P55980 and previous config saved to /var/cache/conftool/dbconfig/20240131-171252-marostegui.json
17:02 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1182 (T355609)', diff saved to https://phabricator.wikimedia.org/P55979 and previous config saved to /var/cache/conftool/dbconfig/20240131-170141-marostegui.json
17:02 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
17:02 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
17:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T355609)', diff saved to https://phabricator.wikimedia.org/P55978 and previous config saved to /var/cache/conftool/dbconfig/20240131-170120-marostegui.json
17:01 phuedx@deploy2002: Finished deploy [analytics/refinery@2c00cad] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@2c00cad1] (duration: 03m 35s)
16:57 ejegg: fundraising civicrm upgraded from 520337a0 to 6344c95e
16:57 phuedx@deploy2002: Started deploy [analytics/refinery@2c00cad] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@2c00cad1]
16:56 phuedx@deploy2002: Finished deploy [analytics/refinery@2c00cad] (thin): Regular analytics weekly train THIN [analytics/refinery@2c00cad1] (duration: 00m 06s)
16:56 phuedx@deploy2002: Started deploy [analytics/refinery@2c00cad] (thin): Regular analytics weekly train THIN [analytics/refinery@2c00cad1]
16:54 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host sretest2005.codfw.wmnet with OS bookworm
16:52 phuedx@deploy2002: Finished deploy [analytics/refinery@2c00cad]: Regular analytics weekly train [analytics/refinery@2c00cad1] (duration: 09m 52s)
16:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P55977 and previous config saved to /var/cache/conftool/dbconfig/20240131-164613-marostegui.json
16:43 phuedx@deploy2002: Started deploy [analytics/refinery@2c00cad]: Regular analytics weekly train [analytics/refinery@2c00cad1]
16:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P55976 and previous config saved to /var/cache/conftool/dbconfig/20240131-163106-marostegui.json
16:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T355609)', diff saved to https://phabricator.wikimedia.org/P55974 and previous config saved to /var/cache/conftool/dbconfig/20240131-161600-marostegui.json
16:07 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1170:3312 (T355609)', diff saved to https://phabricator.wikimedia.org/P55973 and previous config saved to /var/cache/conftool/dbconfig/20240131-160624-marostegui.json
16:06 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
16:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
16:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T355609)', diff saved to https://phabricator.wikimedia.org/P55972 and previous config saved to /var/cache/conftool/dbconfig/20240131-160602-marostegui.json
16:01 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
15:58 moritzm: installing openssh security updates
15:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moscovium.eqiad.wmnet
15:52 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host moscovium.eqiad.wmnet
15:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P55970 and previous config saved to /var/cache/conftool/dbconfig/20240131-155055-marostegui.json
15:50 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/thumbor: apply
15:47 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
15:47 jgiannelos@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
15:47 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
15:46 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
15:46 jgiannelos@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
15:45 jgiannelos@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
15:45 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
15:45 jgiannelos@deploy2002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
15:44 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
15:43 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
15:41 ayounsi@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm2006.codfw.wmnet
15:41 ayounsi@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
15:41 ayounsi@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: testvm2006.codfw.wmnet decommissioned, removing all IPs except the asset tag one - ayounsi@cumin2002"
15:39 ayounsi@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: testvm2006.codfw.wmnet decommissioned, removing all IPs except the asset tag one - ayounsi@cumin2002"
15:36 ayounsi@cumin2002: START - Cookbook sre.dns.netbox
15:36 hnowlan@puppetmaster1001: conftool action : set/pooled=yes:weight=10; selector: name=maps2009.codfw.wmnet
15:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P55969 and previous config saved to /var/cache/conftool/dbconfig/20240131-153549-marostegui.json
15:34 hnowlan@puppetmaster1001: conftool action : set/weight=10; selector: name=maps1009.eqiad.wmnet
15:32 ayounsi@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2006.codfw.wmnet
15:29 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps1009.eqiad.wmnet
15:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T355609)', diff saved to https://phabricator.wikimedia.org/P55968 and previous config saved to /var/cache/conftool/dbconfig/20240131-152042-marostegui.json
15:18 jgiannelos@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
15:17 jgiannelos@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
15:17 jgiannelos@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
15:16 jgiannelos@deploy2002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
15:16 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
15:16 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
15:16 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
15:14 btullis@cumin1002: END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas (exit_code=0) rolling reboot on A:schema
15:14 jgiannelos@deploy2002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
15:14 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
15:14 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
15:10 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1156 (T355609)', diff saved to https://phabricator.wikimedia.org/P55967 and previous config saved to /var/cache/conftool/dbconfig/20240131-151016-marostegui.json
15:10 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
15:10 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
15:09 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
15:09 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
15:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T355609)', diff saved to https://phabricator.wikimedia.org/P55966 and previous config saved to /var/cache/conftool/dbconfig/20240131-150934-marostegui.json
15:09 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
15:08 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
15:08 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
15:07 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
15:06 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/thumbor: apply
15:05 filippo@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
14:58 btullis@cumin1002: START - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas rolling reboot on A:schema
14:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P55965 and previous config saved to /var/cache/conftool/dbconfig/20240131-145427-marostegui.json
14:53 brouberol: I'm going to apply kafka log compaction for {eqiad,codfw}.mediawiki.cirrussearch.page_rerender.v1 on kafka-main-eqiad only (current replica) - T354794
14:52 filippo@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
14:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lists2001.codfw.wmnet
14:46 urbanecm@deploy2002: Finished scap: Backport for Add WikimediaCampaignEvents to extension list (T347894) (duration: 10m 41s)
14:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host lists2001.codfw.wmnet
14:43 filippo@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
14:40 urbanecm@deploy2002: cmelo and urbanecm: Continuing with sync
14:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P55964 and previous config saved to /var/cache/conftool/dbconfig/20240131-143921-marostegui.json
14:37 urbanecm@deploy2002: cmelo and urbanecm: Backport for Add WikimediaCampaignEvents to extension list (T347894) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:36 urbanecm@deploy2002: Started scap: Backport for Add WikimediaCampaignEvents to extension list (T347894)
14:30 urbanecm@deploy2002: Finished scap: Backport for [metawiki] Let admins add/remove the event-organizer group (T356070), index.php: Restore support for forcesafemode option. (T355314) (duration: 10m 05s)
14:28 filippo@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
14:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T355609)', diff saved to https://phabricator.wikimedia.org/P55963 and previous config saved to /var/cache/conftool/dbconfig/20240131-142413-marostegui.json
14:23 urbanecm@deploy2002: daimona and matmarex and urbanecm: Continuing with sync
14:21 urbanecm@deploy2002: daimona and matmarex and urbanecm: Backport for [metawiki] Let admins add/remove the event-organizer group (T356070), index.php: Restore support for forcesafemode option. (T355314) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:21 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase2020.codfw.wmnet with reason: Decommissioning — T352469
14:20 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase2020.codfw.wmnet with reason: Decommissioning — T352469
14:20 urbanecm@deploy2002: Started scap: Backport for [metawiki] Let admins add/remove the event-organizer group (T356070), index.php: Restore support for forcesafemode option. (T355314)
14:19 urbanecm@deploy2002: Finished scap: Backport for decodeURI fragments before sending them to discussiontoolsfindcomment (T356199), decodeURI fragments before sending them to discussiontoolsfindcomment (T356199), Add an exception for ConvenientDiscussions-style permalinks (T349653), [gerrit:994709] Add an exception for ConvenientDiscussions-style permalinks (T349653)
14:18 urbanecm: [urbanecm@mwmaint2002 ~]$ mwscript migrateUserGroup.php --wiki=metawiki campaignevents-beta-tester event-organizer # T356070
14:13 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1146:3312 (T355609)', diff saved to https://phabricator.wikimedia.org/P55962 and previous config saved to /var/cache/conftool/dbconfig/20240131-141316-marostegui.json
14:13 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
14:13 urbanecm@deploy2002: urbanecm and kemayo and matmarex and daimona: Continuing with sync
14:13 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
14:10 urbanecm@deploy2002: urbanecm and kemayo and matmarex and daimona: Backport for decodeURI fragments before sending them to discussiontoolsfindcomment (T356199), decodeURI fragments before sending them to discussiontoolsfindcomment (T356199), Add an exception for ConvenientDiscussions-style permalinks (T349653), [gerrit:994709] Add an exception for ConvenientDiscuss
14:09 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
14:08 urbanecm@deploy2002: Started scap: Backport for decodeURI fragments before sending them to discussiontoolsfindcomment (T356199), decodeURI fragments before sending them to discussiontoolsfindcomment (T356199), Add an exception for ConvenientDiscussions-style permalinks (T349653), [gerrit:994709] Add an exception for ConvenientDiscussions-style permalinks (T349653)
14:08 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
14:07 urbanecm@deploy2002: Finished scap: Backport for testwiki: Temporarily change default value for 4 Echo properties (T353225) (duration: 19m 37s)
14:04 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
14:04 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
14:00 urbanecm@deploy2002: urbanecm: Continuing with sync
13:54 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host people2003.codfw.wmnet
13:51 urbanecm@deploy2002: urbanecm: Backport for testwiki: Temporarily change default value for 4 Echo properties (T353225) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
13:48 jelto@cumin1002: START - Cookbook sre.hosts.reboot-single for host people2003.codfw.wmnet
13:48 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host planet1003.eqiad.wmnet
13:48 urbanecm@deploy2002: Started scap: Backport for testwiki: Temporarily change default value for 4 Echo properties (T353225)
13:44 jelto@cumin1002: START - Cookbook sre.hosts.reboot-single for host planet1003.eqiad.wmnet
13:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189 (T355609)', diff saved to https://phabricator.wikimedia.org/P55960 and previous config saved to /var/cache/conftool/dbconfig/20240131-133143-marostegui.json
13:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow6001.drmrs.wmnet
13:23 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow6001.drmrs.wmnet
13:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow5002.eqsin.wmnet
13:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P55959 and previous config saved to /var/cache/conftool/dbconfig/20240131-131637-marostegui.json
13:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow5002.eqsin.wmnet
13:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow4002.ulsfo.wmnet
13:09 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow4002.ulsfo.wmnet
13:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow3003.esams.wmnet
13:04 filippo@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host titan1002.eqiad.wmnet
13:04 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow3003.esams.wmnet
13:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow2003.codfw.wmnet
13:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P55957 and previous config saved to /var/cache/conftool/dbconfig/20240131-130130-marostegui.json
12:58 filippo@cumin1002: START - Cookbook sre.hosts.reboot-single for host titan1002.eqiad.wmnet
12:57 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow2003.codfw.wmnet
12:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow1002.eqiad.wmnet
12:48 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow1002.eqiad.wmnet
12:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189 (T355609)', diff saved to https://phabricator.wikimedia.org/P55956 and previous config saved to /var/cache/conftool/dbconfig/20240131-124623-marostegui.json
12:44 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
12:44 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
12:44 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
12:44 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
12:42 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host netmon1003.wikimedia.org
12:32 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2189 (T355609)', diff saved to https://phabricator.wikimedia.org/P55955 and previous config saved to /var/cache/conftool/dbconfig/20240131-123224-marostegui.json
12:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2189.codfw.wmnet with reason: Maintenance
12:32 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2189.codfw.wmnet with reason: Maintenance
12:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T355609)', diff saved to https://phabricator.wikimedia.org/P55954 and previous config saved to /var/cache/conftool/dbconfig/20240131-123203-marostegui.json
12:31 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netmon1003.wikimedia.org
12:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netmon2002.wikimedia.org
12:24 btullis@cumin1002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host dbstore1009.eqiad.wmnet
12:22 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netmon2002.wikimedia.org
12:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt-staging2001.codfw.wmnet
12:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P55953 and previous config saved to /var/cache/conftool/dbconfig/20240131-121656-marostegui.json
12:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt-staging2001.codfw.wmnet
12:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-replica2008.wikimedia.org
12:13 claime: Raising external traffic to mw-on-k8s to 35% - T355532
12:13 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host stewards2001.codfw.wmnet
12:12 btullis@cumin1002: START - Cookbook sre.hosts.reboot-single for host dbstore1009.eqiad.wmnet
12:11 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dbstore1008.eqiad.wmnet
12:11 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-replica2008.wikimedia.org
12:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-replica2007.wikimedia.org
12:10 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
12:10 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
12:10 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
12:09 jelto@cumin1002: START - Cookbook sre.hosts.reboot-single for host stewards2001.codfw.wmnet
12:08 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host stewards1001.eqiad.wmnet
12:08 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
12:08 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
12:08 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
12:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-replica2007.wikimedia.org
12:07 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
12:07 akosiaris@deploy2002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply
12:06 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
12:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-replica1006.wikimedia.org
12:05 akosiaris@deploy2002: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply
12:05 akosiaris@deploy2002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply
12:04 jelto@cumin1002: START - Cookbook sre.hosts.reboot-single for host stewards1001.eqiad.wmnet
12:04 akosiaris@deploy2002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
12:04 akosiaris@deploy2002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply
12:03 akosiaris@deploy2002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply
12:03 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host planet2003.codfw.wmnet
12:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-replica1006.wikimedia.org
12:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P55952 and previous config saved to /var/cache/conftool/dbconfig/20240131-120150-marostegui.json
12:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-replica1005.wikimedia.org
12:00 btullis@cumin1002: START - Cookbook sre.hosts.reboot-single for host dbstore1008.eqiad.wmnet
11:59 jelto@cumin1002: START - Cookbook sre.hosts.reboot-single for host planet2003.codfw.wmnet
11:57 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host people1004.eqiad.wmnet
11:57 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-replica1005.wikimedia.org
11:51 jelto@cumin1002: START - Cookbook sre.hosts.reboot-single for host people1004.eqiad.wmnet
11:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T355609)', diff saved to https://phabricator.wikimedia.org/P55951 and previous config saved to /var/cache/conftool/dbconfig/20240131-114643-marostegui.json
11:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard1003.eqiad.wmnet
11:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetboard1003.eqiad.wmnet
11:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard2003.codfw.wmnet
11:38 stevemunene@cumin1002: END (FAIL) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=99) for hosts an-worker[1157-1175].eqiad.wmnet
11:38 stevemunene@cumin1002: START - Cookbook sre.hadoop.init-hadoop-workers for hosts an-worker[1157-1175].eqiad.wmnet
11:37 stevemunene@cumin1002: END (FAIL) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=99) for hosts an-worker[1157-1175].eqiad.wmnet
11:35 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetboard2003.codfw.wmnet
11:35 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2175 (T355609)', diff saved to https://phabricator.wikimedia.org/P55950 and previous config saved to /var/cache/conftool/dbconfig/20240131-113518-marostegui.json
11:35 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2175.codfw.wmnet with reason: Maintenance
11:35 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2175.codfw.wmnet with reason: Maintenance
11:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312 (T355609)', diff saved to https://phabricator.wikimedia.org/P55949 and previous config saved to /var/cache/conftool/dbconfig/20240131-113456-marostegui.json
11:34 filippo@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host titan1001.eqiad.wmnet
11:29 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1424.eqiad.wmnet with OS bullseye
11:28 ayounsi@cumin1002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=97) for new host testvm2006.codfw.wmnet
11:27 ayounsi@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host testvm2006.codfw.wmnet with OS bookworm
11:27 filippo@cumin1002: START - Cookbook sre.hosts.reboot-single for host titan1001.eqiad.wmnet
11:26 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1423.eqiad.wmnet with OS bullseye
11:24 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1425.eqiad.wmnet with OS bullseye
11:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312', diff saved to https://phabricator.wikimedia.org/P55948 and previous config saved to /var/cache/conftool/dbconfig/20240131-111949-marostegui.json
11:11 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1424.eqiad.wmnet with reason: host reimage
11:08 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1423.eqiad.wmnet with reason: host reimage
11:05 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1425.eqiad.wmnet with reason: host reimage
11:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312', diff saved to https://phabricator.wikimedia.org/P55947 and previous config saved to /var/cache/conftool/dbconfig/20240131-110442-marostegui.json
11:02 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1424.eqiad.wmnet with reason: host reimage
11:02 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1423.eqiad.wmnet with reason: host reimage
11:01 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1425.eqiad.wmnet with reason: host reimage
10:53 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) testvm2006.codfw.wmnet on all recursors
10:53 ayounsi@cumin1002: START - Cookbook sre.dns.wipe-cache testvm2006.codfw.wmnet on all recursors
10:51 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
10:51 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: codfw routed cluster tap - ayounsi@cumin1002"
10:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312 (T355609)', diff saved to https://phabricator.wikimedia.org/P55946 and previous config saved to /var/cache/conftool/dbconfig/20240131-104936-marostegui.json
10:49 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: codfw routed cluster tap - ayounsi@cumin1002"
10:48 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw1424.eqiad.wmnet with OS bullseye
10:48 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw1423.eqiad.wmnet with OS bullseye
10:48 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw1425.eqiad.wmnet with OS bullseye
10:46 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
10:43 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
10:42 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
10:41 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
10:41 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
10:40 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be1003.eqiad.wmnet
10:38 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2170:3312 (T355609)', diff saved to https://phabricator.wikimedia.org/P55945 and previous config saved to /var/cache/conftool/dbconfig/20240131-103830-marostegui.json
10:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2170.codfw.wmnet with reason: Maintenance
10:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2170.codfw.wmnet with reason: Maintenance
10:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T355609)', diff saved to https://phabricator.wikimedia.org/P55944 and previous config saved to /var/cache/conftool/dbconfig/20240131-103807-marostegui.json
10:36 stevemunene@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1157.eqiad.wmnet
10:35 btullis@deploy2002: Finished deploy [analytics/refinery@13f7a06] (hadoop-test): Ad-hoc deploy of refinery TEST for T354703 [analytics/refinery@13f7a06c] (duration: 00m 07s)
10:35 btullis@deploy2002: Started deploy [analytics/refinery@13f7a06] (hadoop-test): Ad-hoc deploy of refinery TEST for T354703 [analytics/refinery@13f7a06c]
10:35 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host moss-be1003.eqiad.wmnet
10:33 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on testvm2006.codfw.wmnet with reason: host reimage
10:30 btullis@deploy2002: Finished deploy [analytics/refinery@13f7a06] (hadoop-test): Ad-hoc deploy of refinery TEST for T354703 [analytics/refinery@13f7a06c] (duration: 00m 05s)
10:30 btullis@deploy2002: Started deploy [analytics/refinery@13f7a06] (hadoop-test): Ad-hoc deploy of refinery TEST for T354703 [analytics/refinery@13f7a06c]
10:30 ayounsi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on testvm2006.codfw.wmnet with reason: host reimage
10:30 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be1002.eqiad.wmnet
10:29 stevemunene@cumin1002: START - Cookbook sre.hosts.reboot-single for host an-worker1157.eqiad.wmnet
10:25 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testreduce1002.eqiad.wmnet
10:24 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host moss-be1002.eqiad.wmnet
10:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P55943 and previous config saved to /var/cache/conftool/dbconfig/20240131-102300-marostegui.json
10:21 cgoubert@cumin2002: START - Cookbook sre.hosts.reboot-single for host testreduce1002.eqiad.wmnet
10:20 cgoubert@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host testreduce1002.eqiad.wmnet
10:20 cgoubert@cumin2002: START - Cookbook sre.hosts.reboot-single for host testreduce1002.eqiad.wmnet
10:10 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be1001.eqiad.wmnet
10:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P55942 and previous config saved to /var/cache/conftool/dbconfig/20240131-100754-marostegui.json
10:03 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host moss-be1001.eqiad.wmnet
10:02 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host testvm2006.codfw.wmnet with OS bookworm
10:00 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be2003.codfw.wmnet
09:53 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host moss-be2003.codfw.wmnet
09:53 mvernon@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host moss-be2003.codfw.wmnet
09:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T355609)', diff saved to https://phabricator.wikimedia.org/P55941 and previous config saved to /var/cache/conftool/dbconfig/20240131-095247-marostegui.json
09:52 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host moss-be2003.codfw.wmnet
09:51 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM testvm2006.codfw.wmnet - ayounsi@cumin1002"
09:51 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM testvm2006.codfw.wmnet - ayounsi@cumin1002"
09:50 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) testvm2006.codfw.wmnet on all recursors
09:50 ayounsi@cumin1002: START - Cookbook sre.dns.wipe-cache testvm2006.codfw.wmnet on all recursors
09:50 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
09:50 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM testvm2006.codfw.wmnet - ayounsi@cumin1002"
09:49 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM testvm2006.codfw.wmnet - ayounsi@cumin1002"
09:47 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
09:47 ayounsi@cumin1002: START - Cookbook sre.ganeti.makevm for new host testvm2006.codfw.wmnet
09:43 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2148 (T355609)', diff saved to https://phabricator.wikimedia.org/P55940 and previous config saved to /var/cache/conftool/dbconfig/20240131-094301-marostegui.json
09:42 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2148.codfw.wmnet with reason: Maintenance
09:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2148.codfw.wmnet with reason: Maintenance
09:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312 (T355609)', diff saved to https://phabricator.wikimedia.org/P55939 and previous config saved to /var/cache/conftool/dbconfig/20240131-094239-marostegui.json
09:38 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast2003.wikimedia.org
09:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast5004.wikimedia.org
09:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312', diff saved to https://phabricator.wikimedia.org/P55938 and previous config saved to /var/cache/conftool/dbconfig/20240131-092733-marostegui.json
09:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast5004.wikimedia.org
09:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast4005.wikimedia.org
09:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast4005.wikimedia.org
09:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312', diff saved to https://phabricator.wikimedia.org/P55937 and previous config saved to /var/cache/conftool/dbconfig/20240131-091226-marostegui.json
09:08 filippo@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host titan2002.codfw.wmnet
09:07 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host sretest1003.eqiad.wmnet
09:01 filippo@cumin1002: START - Cookbook sre.hosts.reboot-single for host titan2002.codfw.wmnet
08:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312 (T355609)', diff saved to https://phabricator.wikimedia.org/P55936 and previous config saved to /var/cache/conftool/dbconfig/20240131-085719-marostegui.json
08:55 filippo@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host titan2001.codfw.wmnet
08:52 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest1003.eqiad.wmnet
08:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest1002.eqiad.wmnet
08:47 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2138:3312 (T355609)', diff saved to https://phabricator.wikimedia.org/P55935 and previous config saved to /var/cache/conftool/dbconfig/20240131-084700-marostegui.json
08:46 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2138.codfw.wmnet with reason: Maintenance
08:46 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2138.codfw.wmnet with reason: Maintenance
08:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126 (T355609)', diff saved to https://phabricator.wikimedia.org/P55934 and previous config saved to /var/cache/conftool/dbconfig/20240131-084637-marostegui.json
08:45 filippo@cumin1002: START - Cookbook sre.hosts.reboot-single for host titan2001.codfw.wmnet
08:44 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest1002.eqiad.wmnet
08:44 filippo@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host grafana2001.codfw.wmnet
08:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host crm2001.codfw.wmnet
08:40 filippo@cumin1002: START - Cookbook sre.hosts.reboot-single for host grafana2001.codfw.wmnet
08:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host crm2001.codfw.wmnet
08:31 marostegui@cumin1002: dbctl commit (dc=all): 'db2114 (re)pooling @ 100%: After Bookworm upgrade T354506', diff saved to https://phabricator.wikimedia.org/P55932 and previous config saved to /var/cache/conftool/dbconfig/20240131-083142-root.json
08:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P55931 and previous config saved to /var/cache/conftool/dbconfig/20240131-083130-marostegui.json
08:27 moritzm: installing systemd bugfix updates from bookworm 12.4 point release
08:21 moritzm: installing systemd bugfix updates from bookworm 12.4 point release
08:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testvm2004.codfw.wmnet
08:18 slyngshede@cumin1002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM idm1001.wikimedia.org
08:17 filippo@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
08:16 marostegui@cumin1002: dbctl commit (dc=all): 'db2114 (re)pooling @ 75%: After Bookworm upgrade T354506', diff saved to https://phabricator.wikimedia.org/P55930 and previous config saved to /var/cache/conftool/dbconfig/20240131-081637-root.json
08:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host testvm2004.codfw.wmnet
08:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P55929 and previous config saved to /var/cache/conftool/dbconfig/20240131-081624-marostegui.json
08:14 slyngshede@cumin1002: START - Cookbook sre.ganeti.reboot-vm for VM idm1001.wikimedia.org
08:13 slyngshede@cumin1002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM idm2001.wikimedia.org
08:13 filippo@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
08:09 moritzm: installing ca-certificates-java bugfix updates from bookworm 12.4 point release
08:09 slyngshede@cumin1002: START - Cookbook sre.ganeti.reboot-vm for VM idm2001.wikimedia.org
08:09 slyngshede@cumin1002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM idm-test1001.wikimedia.org
08:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt1002.wikimedia.org
08:05 slyngshede@cumin1002: START - Cookbook sre.ganeti.reboot-vm for VM idm-test1001.wikimedia.org
08:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt1002.wikimedia.org
08:01 marostegui@cumin1002: dbctl commit (dc=all): 'db2114 (re)pooling @ 50%: After Bookworm upgrade T354506', diff saved to https://phabricator.wikimedia.org/P55928 and previous config saved to /var/cache/conftool/dbconfig/20240131-080132-root.json
08:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126 (T355609)', diff saved to https://phabricator.wikimedia.org/P55927 and previous config saved to /var/cache/conftool/dbconfig/20240131-080117-marostegui.json
07:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-rw2001.wikimedia.org
07:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2126 (T355609)', diff saved to https://phabricator.wikimedia.org/P55926 and previous config saved to /var/cache/conftool/dbconfig/20240131-075600-marostegui.json
07:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2187.codfw.wmnet with reason: Maintenance
07:55 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2187.codfw.wmnet with reason: Maintenance
07:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2126.codfw.wmnet with reason: Maintenance
07:55 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2126.codfw.wmnet with reason: Maintenance
07:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125 (T355609)', diff saved to https://phabricator.wikimedia.org/P55925 and previous config saved to /var/cache/conftool/dbconfig/20240131-075522-marostegui.json
07:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-rw2001.wikimedia.org
07:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-rw1001.wikimedia.org
07:46 marostegui@cumin1002: dbctl commit (dc=all): 'db2114 (re)pooling @ 25%: After Bookworm upgrade T354506', diff saved to https://phabricator.wikimedia.org/P55924 and previous config saved to /var/cache/conftool/dbconfig/20240131-074627-root.json
07:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-rw1001.wikimedia.org
07:43 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
07:43 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: codfw routed cluster tap - ayounsi@cumin1002"
07:42 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: codfw routed cluster tap - ayounsi@cumin1002"
07:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125', diff saved to https://phabricator.wikimedia.org/P55923 and previous config saved to /var/cache/conftool/dbconfig/20240131-074015-marostegui.json
07:39 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
07:38 ayounsi@cumin1002: END (ERROR) - Cookbook sre.dns.netbox (exit_code=97)
07:38 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
07:31 marostegui@cumin1002: dbctl commit (dc=all): 'db2114 (re)pooling @ 10%: After Bookworm upgrade T354506', diff saved to https://phabricator.wikimedia.org/P55922 and previous config saved to /var/cache/conftool/dbconfig/20240131-073121-root.json
07:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host debmonitor1003.eqiad.wmnet
07:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125', diff saved to https://phabricator.wikimedia.org/P55921 and previous config saved to /var/cache/conftool/dbconfig/20240131-072509-marostegui.json
07:24 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host debmonitor1003.eqiad.wmnet
07:21 marostegui@cumin1002: dbctl commit (dc=all): 'db1224 (re)pooling @ 100%: After onsite maintenance', diff saved to https://phabricator.wikimedia.org/P55920 and previous config saved to /var/cache/conftool/dbconfig/20240131-072129-root.json
07:16 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2142.codfw.wmnet with OS bookworm
07:16 marostegui@cumin1002: dbctl commit (dc=all): 'db2114 (re)pooling @ 5%: After Bookworm upgrade T354506', diff saved to https://phabricator.wikimedia.org/P55919 and previous config saved to /var/cache/conftool/dbconfig/20240131-071616-root.json
07:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host debmonitor2003.codfw.wmnet
07:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125 (T355609)', diff saved to https://phabricator.wikimedia.org/P55918 and previous config saved to /var/cache/conftool/dbconfig/20240131-071002-marostegui.json
07:08 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host debmonitor2003.codfw.wmnet
07:06 marostegui@cumin1002: dbctl commit (dc=all): 'db1224 (re)pooling @ 75%: After onsite maintenance', diff saved to https://phabricator.wikimedia.org/P55917 and previous config saved to /var/cache/conftool/dbconfig/20240131-070624-root.json
07:01 marostegui@cumin1002: dbctl commit (dc=all): 'db2114 (re)pooling @ 1%: After Bookworm upgrade T354506', diff saved to https://phabricator.wikimedia.org/P55916 and previous config saved to /var/cache/conftool/dbconfig/20240131-070111-root.json
06:59 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2125 (T355609)', diff saved to https://phabricator.wikimedia.org/P55915 and previous config saved to /var/cache/conftool/dbconfig/20240131-065922-marostegui.json
06:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2125.codfw.wmnet with reason: Maintenance
06:59 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2125.codfw.wmnet with reason: Maintenance
06:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2107 (T355609)', diff saved to https://phabricator.wikimedia.org/P55914 and previous config saved to /var/cache/conftool/dbconfig/20240131-065901-marostegui.json
06:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2142.codfw.wmnet with reason: host reimage
06:54 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2114.codfw.wmnet with OS bookworm
06:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2142.codfw.wmnet with reason: host reimage
06:51 marostegui@cumin1002: dbctl commit (dc=all): 'db1224 (re)pooling @ 50%: After onsite maintenance', diff saved to https://phabricator.wikimedia.org/P55913 and previous config saved to /var/cache/conftool/dbconfig/20240131-065118-root.json
06:47 moritzm: installing glibc security updates on bookworm
06:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2107', diff saved to https://phabricator.wikimedia.org/P55912 and previous config saved to /var/cache/conftool/dbconfig/20240131-064353-marostegui.json
06:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2114.codfw.wmnet with reason: host reimage
06:36 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2114.codfw.wmnet with reason: host reimage
06:36 marostegui@cumin1002: dbctl commit (dc=all): 'db1224 (re)pooling @ 25%: After onsite maintenance', diff saved to https://phabricator.wikimedia.org/P55911 and previous config saved to /var/cache/conftool/dbconfig/20240131-063613-root.json
06:35 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2142.codfw.wmnet with OS bookworm
06:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2107', diff saved to https://phabricator.wikimedia.org/P55910 and previous config saved to /var/cache/conftool/dbconfig/20240131-062846-marostegui.json
06:22 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2114.codfw.wmnet with OS bookworm
06:21 marostegui@cumin1002: dbctl commit (dc=all): 'db1224 (re)pooling @ 10%: After onsite maintenance', diff saved to https://phabricator.wikimedia.org/P55909 and previous config saved to /var/cache/conftool/dbconfig/20240131-062109-root.json
06:19 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2114 T354506', diff saved to https://phabricator.wikimedia.org/P55908 and previous config saved to /var/cache/conftool/dbconfig/20240131-061932-root.json
06:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2107 (T355609)', diff saved to https://phabricator.wikimedia.org/P55907 and previous config saved to /var/cache/conftool/dbconfig/20240131-061340-marostegui.json
06:06 marostegui@cumin1002: dbctl commit (dc=all): 'db1224 (re)pooling @ 5%: After onsite maintenance', diff saved to https://phabricator.wikimedia.org/P55906 and previous config saved to /var/cache/conftool/dbconfig/20240131-060602-root.json
06:03 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2107 (T355609)', diff saved to https://phabricator.wikimedia.org/P55905 and previous config saved to /var/cache/conftool/dbconfig/20240131-060337-marostegui.json
06:03 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2107.codfw.wmnet with reason: Maintenance
06:03 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2107.codfw.wmnet with reason: Maintenance
05:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2097.codfw.wmnet with reason: Maintenance
05:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2097.codfw.wmnet with reason: Maintenance
05:50 marostegui@cumin1002: dbctl commit (dc=all): 'db1224 (re)pooling @ 1%: After onsite maintenance', diff saved to https://phabricator.wikimedia.org/P55904 and previous config saved to /var/cache/conftool/dbconfig/20240131-055057-root.json
05:41 eileen: civicrm upgraded from 6de61520 to 520337a0
05:30 fab@deploy2002: Finished deploy [airflow-dags/research@97c6a4e]: (no justification provided) (duration: 00m 14s)
05:30 fab@deploy2002: Started deploy [airflow-dags/research@97c6a4e]: (no justification provided)
03:29 eileen: tools upgraded from 02281338 to c823e692
03:05 fab@deploy2002: Finished deploy [airflow-dags/research@6a97a34]: (no justification provided) (duration: 00m 23s)
03:05 fab@deploy2002: Started deploy [airflow-dags/research@6a97a34]: (no justification provided)

2024-01-30

23:54 mutante: LDAP - added aklapper to group releng T356043
23:07 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for sessionstore1006.eqiad.wmnet
23:07 eevans@cumin1002: START - Cookbook sre.hosts.remove-downtime for sessionstore1006.eqiad.wmnet
22:49 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on sessionstore1006.eqiad.wmnet with reason: Bootstrapping — T353402
22:48 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on sessionstore1006.eqiad.wmnet with reason: Bootstrapping — T353402
22:41 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: activate first private IP host config - bking@cumin2002 - T355617
22:20 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for sessionstore1005.eqiad.wmnet
22:20 eevans@cumin1002: START - Cookbook sre.hosts.remove-downtime for sessionstore1005.eqiad.wmnet
22:10 cjming: end of UTC late backport window
22:09 cjming@deploy2002: Finished scap: Backport for [eswiki] Add 13 namespaces to $wgExemptFromUserRobotsControl (T355033) (duration: 08m 24s)
22:02 cjming@deploy2002: cjming and superpes: Continuing with sync
22:02 cjming@deploy2002: cjming and superpes: Backport for [eswiki] Add 13 namespaces to $wgExemptFromUserRobotsControl (T355033) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
22:00 cjming@deploy2002: Started scap: Backport for [eswiki] Add 13 namespaces to $wgExemptFromUserRobotsControl (T355033)
21:59 cjming@deploy2002: Finished scap: Backport for [ukwiki] Change autoconfirmed setting (T355972), [ganwiki] Add 'suppressredirect' to transwiki usergroup and change assignment and revocation methods (T354850), [ganwiki] Add new namespace aliases (T355854) (duration: 09m 32s)
21:53 cjming@deploy2002: superpes and cjming: Continuing with sync
21:51 cjming@deploy2002: superpes and cjming: Backport for [ukwiki] Change autoconfirmed setting (T355972), [ganwiki] Add 'suppressredirect' to transwiki usergroup and change assignment and revocation methods (T354850), [ganwiki] Add new namespace aliases (T355854) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
21:50 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on sessionstore1005.eqiad.wmnet with reason: Bootstrapping — T353402
21:50 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on sessionstore1005.eqiad.wmnet with reason: Bootstrapping — T353402
21:49 cjming@deploy2002: Started scap: Backport for [ukwiki] Change autoconfirmed setting (T355972), [ganwiki] Add 'suppressredirect' to transwiki usergroup and change assignment and revocation methods (T354850), [ganwiki] Add new namespace aliases (T355854)
21:44 cjming@deploy2002: Finished scap: Backport for Run CheckerJob against read-only clusters (T354793) (duration: 07m 41s)
21:42 mutante: LDAP - added jnuche to group releng (T356043) - already done/approved in the past in T301149
21:41 mutante: LDAP - added jhuneidi to group releng (T356043) - already done/approved in the past in T210028
21:40 mutante: LDAP - added brennen to group releng (T356043) - already done/approved in the past in T215365
21:38 cjming@deploy2002: cjming and ebernhardson: Continuing with sync
21:38 cjming@deploy2002: cjming and ebernhardson: Backport for Run CheckerJob against read-only clusters (T354793) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
21:37 cjming@deploy2002: Started scap: Backport for Run CheckerJob against read-only clusters (T354793)
21:36 cjming@deploy2002: Finished scap: Backport for Run CheckerJob against read-only clusters (T354793) (duration: 07m 49s)
21:34 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: activate first private IP host config - bking@cumin2002 - T355617
21:30 cjming@deploy2002: ebernhardson and cjming: Continuing with sync
21:30 cjming@deploy2002: ebernhardson and cjming: Backport for Run CheckerJob against read-only clusters (T354793) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
21:28 cjming@deploy2002: Started scap: Backport for Run CheckerJob against read-only clusters (T354793)
21:01 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for sessionstore1004.eqiad.wmnet
21:01 eevans@cumin1002: START - Cookbook sre.hosts.remove-downtime for sessionstore1004.eqiad.wmnet
20:52 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: activate first private IP host config - bking@cumin2002 - T355617
20:51 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: activate first private IP host config - bking@cumin2002 - T355617
20:38 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on sessionstore1004.eqiad.wmnet with reason: Commissioning — T353402
20:38 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on sessionstore1004.eqiad.wmnet with reason: Commissioning — T353402
20:35 urandom: bootstrapping sessionstore1004/cassandra-a — T353402
20:01 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: wdqs::public
19:45 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: wdqs::public
19:36 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Unbanning all hosts in cloudelastic
19:36 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Unbanning all hosts in cloudelastic
19:36 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.ban (exit_code=99) Banning hosts: cloudelastic1010.eqiad.wmnet for use cloudelastic1010 as migration canary - bking@cumin2002 - T355617
19:36 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Banning hosts: cloudelastic1010.eqiad.wmnet for use cloudelastic1010 as migration canary - bking@cumin2002 - T355617
19:27 Lucas_WMDE: FINISHED lucaswerkmeister-wmde@mwmaint2002:~$ mwscript CheckSignatures enwiki | tee T356168 # -- 268378 invalid signatures --
19:10 dancy@deploy2002: rebuilt and synchronized wikiversions files: group0 wikis to 1.42.0-wmf.16 refs T354434
19:09 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2005.codfw.wmnet with OS bookworm
18:52 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
18:52 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
18:46 xcollazo@deploy2002: Finished deploy [airflow-dags/analytics@ccaa5dc]: (no justification provided) (duration: 00m 05s)
18:46 xcollazo@deploy2002: Started deploy [airflow-dags/analytics@ccaa5dc]: (no justification provided)
18:17 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host sretest2005.codfw.wmnet with OS bookworm
18:16 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2005.codfw.wmnet with OS bookworm
18:05 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
18:04 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
18:04 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
18:04 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
18:04 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
18:03 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
18:03 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
18:03 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
18:02 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
18:02 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
18:02 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
18:02 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
17:37 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: activate new elastic config - bking@cumin2002 - T355617
17:37 urandom: DROP test_spark3_loading keyspace, Generated Data (Cassandra) cluster — T356112
17:22 jforrester@deploy2002: Finished scap: Backport for Do not search for elements if no previews have been registered (T355933 T356186 T356193), Do not search for elements if no previews have been registered (T355933 T356186 T356193) (duration: 11m 51s)
17:21 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host sretest2005.codfw.wmnet with OS bookworm
17:15 jforrester@deploy2002: jforrester: Continuing with sync
17:14 jforrester@deploy2002: jforrester: Backport for Do not search for elements if no previews have been registered (T355933 T356186 T356193), Do not search for elements if no previews have been registered (T355933 T356186 T356193) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
17:13 ayounsi@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2005.codfw.wmnet with OS bookworm
17:10 jforrester@deploy2002: Started scap: Backport for Do not search for elements if no previews have been registered (T355933 T356186 T356193), Do not search for elements if no previews have been registered (T355933 T356186 T356193)
16:57 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: activate new elastic config - bking@cumin2002 - T355617
16:56 bking@cumin2002: conftool action : set/weight=10; selector: name=cloudelastic1009.wikimedia.org
16:56 bking@cumin2002: conftool action : set/weight=10; selector: name=cloudelastic1008.wikimedia.org
16:56 bking@cumin2002: conftool action : set/weight=10; selector: name=cloudelastic1007.wikimedia.org
16:54 claime: Running homer 'cr*codfw*' commit 'T351074'
16:54 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: sync
16:54 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: sync
16:49 mutante: gitlab is back
16:48 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
16:47 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
16:47 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
16:47 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
16:44 mutante: gitlab is down for maintenance for a few minutes
16:34 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: activate new elastic config - bking@cumin2002 - T355617
16:29 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on gitlab.wikimedia.org with reason: server move
16:29 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on gitlab.wikimedia.org with reason: server move
16:28 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on gitlab2002.wikimedia.org with reason: server move
16:28 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on gitlab2002.wikimedia.org with reason: server move
16:25 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1466.eqiad.wmnet with OS bullseye
16:21 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1457.eqiad.wmnet with OS bullseye
16:18 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2366.codfw.wmnet with OS bullseye
16:14 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1440.eqiad.wmnet with OS bullseye
16:14 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: activate new elastic config - bking@cumin2002 - T355617
16:13 bking@cumin2002: conftool action : set/pooled=yes; selector: name=cloudelastic1008.wikimedia.org
16:13 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2370.codfw.wmnet with OS bullseye
16:11 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host sretest2005.codfw.wmnet with OS bookworm
16:09 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1482.eqiad.wmnet with OS bullseye
16:08 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2368.codfw.wmnet with OS bullseye
16:06 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1466.eqiad.wmnet with reason: host reimage
16:03 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1459.eqiad.wmnet with OS bullseye
16:02 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1457.eqiad.wmnet with reason: host reimage
15:59 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2366.codfw.wmnet with reason: host reimage
15:58 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cloudelastic1010.eqiad.wmnet with reason: T355617
15:58 bking@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on cloudelastic1010.eqiad.wmnet with reason: T355617
15:56 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1440.eqiad.wmnet with reason: host reimage
15:54 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: activate new elastic config - bking@cumin2002 - T355617
15:53 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2370.codfw.wmnet with reason: host reimage
15:50 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1482.eqiad.wmnet with reason: host reimage
15:47 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2368.codfw.wmnet with reason: host reimage
15:44 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1459.eqiad.wmnet with reason: host reimage
15:42 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2370.codfw.wmnet with reason: host reimage
15:42 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1457.eqiad.wmnet with reason: host reimage
15:42 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1466.eqiad.wmnet with reason: host reimage
15:42 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2366.codfw.wmnet with reason: host reimage
15:42 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1440.eqiad.wmnet with reason: host reimage
15:41 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2368.codfw.wmnet with reason: host reimage
15:41 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1482.eqiad.wmnet with reason: host reimage
15:41 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1459.eqiad.wmnet with reason: host reimage
15:40 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: activate new elastic config - bking@cumin2002 - T355617
15:29 Lucas_WMDE: START lucaswerkmeister-wmde@mwmaint2002:~$ mwscript CheckSignatures enwiki | tee T356168
15:28 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1466.eqiad.wmnet with OS bullseye
15:28 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1459.eqiad.wmnet with OS bullseye
15:28 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1482.eqiad.wmnet with OS bullseye
15:28 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1457.eqiad.wmnet with OS bullseye
15:27 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1440.eqiad.wmnet with OS bullseye
15:26 Lucas_WMDE: UTC afternoon backport+config window done
15:26 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw2370.codfw.wmnet with OS bullseye
15:25 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw2368.codfw.wmnet with OS bullseye
15:25 cgoubert@cumin2002: START - Cookbook sre.hosts.reimage for host mw2366.codfw.wmnet with OS bullseye
15:17 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint2002:~$ mwscript namespaceDupes enwikiquote --fix # T355195 (two pages will need separate fixing)
15:17 claime: Recommissioning mw2366.codfw.wmnet,mw2368.codfw.wmnet,mw2370.codfw.wmnet as k8s nodes - T351074
15:17 ayounsi@cumin1002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host sretest2005.codfw.wmnet
15:17 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2005.codfw.wmnet with OS bookworm
15:16 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for [enwikiquote] Add a draft namespace and its talk space (T355195) (duration: 08m 43s)
15:09 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and superpes: Continuing with sync
15:08 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and superpes: Backport for [enwikiquote] Add a draft namespace and its talk space (T355195) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
15:07 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for [enwikiquote] Add a draft namespace and its talk space (T355195)
15:06 claime: Manual run of mediawiki_job_generatecaptcha.service following timer failure - T141490
15:06 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint2002:~$ mwscript namespaceDupes enwiktionary --fix # T354813
15:05 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for [enwiktionary] Remove the Concordance namespace and its talk space (T354813) (duration: 09m 57s)
14:59 logmsgbot: lucaswerkmeister-wmde@deploy2002 superpes and lucaswerkmeister-wmde: Continuing with sync
14:57 logmsgbot: lucaswerkmeister-wmde@deploy2002 superpes and lucaswerkmeister-wmde: Backport for [enwiktionary] Remove the Concordance namespace and its talk space (T354813) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:55 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for [enwiktionary] Remove the Concordance namespace and its talk space (T354813)
14:52 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint2002:~$ mwscript namespaceDupes azwiki --fix # T355041, failed at the end :(
14:52 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for [azwiki] Changing 9 namespace aliases (T355041) (duration: 08m 37s)
14:46 logmsgbot: lucaswerkmeister-wmde@deploy2002 superpes and lucaswerkmeister-wmde: Continuing with sync
14:45 logmsgbot: lucaswerkmeister-wmde@deploy2002 superpes and lucaswerkmeister-wmde: Backport for [azwiki] Changing 9 namespace aliases (T355041) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:43 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for [azwiki] Changing 9 namespace aliases (T355041)
14:41 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for CommentParser: Ignore generated timestamp links (T356142), CommentParser: Ignore generated timestamp links (T356142), Add maintenance script to list users with invalid signatures (T356168) (duration: 11m 01s)
14:40 filippo@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
14:35 logmsgbot: lucaswerkmeister-wmde@deploy2002 matmarex and lucaswerkmeister-wmde: Continuing with sync
14:32 gmodena@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
14:32 logmsgbot: lucaswerkmeister-wmde@deploy2002 matmarex and lucaswerkmeister-wmde: Backport for CommentParser: Ignore generated timestamp links (T356142), CommentParser: Ignore generated timestamp links (T356142), Add maintenance script to list users with invalid signatures (T356168) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:31 gmodena@deploy2002: helmfile [codfw] START helmfile.d/services/eventstreams: apply
14:31 gmodena@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
14:30 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for CommentParser: Ignore generated timestamp links (T356142), CommentParser: Ignore generated timestamp links (T356142), Add maintenance script to list users with invalid signatures (T356168)
14:30 gmodena@deploy2002: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
14:30 filippo@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
14:26 gmodena@deploy2002: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
14:26 gmodena@deploy2002: helmfile [staging] START helmfile.d/services/eventstreams: apply
14:20 logmsgbot: lucaswerkmeister-wmde@deploy2002 backport Cancelled
14:18 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for Don't bail out early when there are no selectors configured (T355933) (duration: 09m 04s)
14:12 logmsgbot: lucaswerkmeister-wmde@deploy2002 wmde-fisch and lucaswerkmeister-wmde: Continuing with sync
14:11 logmsgbot: lucaswerkmeister-wmde@deploy2002 wmde-fisch and lucaswerkmeister-wmde: Backport for Don't bail out early when there are no selectors configured (T355933) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:11 volans@cumin2002: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
14:09 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for Don't bail out early when there are no selectors configured (T355933)
14:09 volans@cumin2002: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
13:56 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host sretest2005.codfw.wmnet with OS bookworm
13:55 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM sretest2005.codfw.wmnet - ayounsi@cumin1002"
13:55 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM sretest2005.codfw.wmnet - ayounsi@cumin1002"
13:54 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) sretest2005.codfw.wmnet on all recursors
13:54 ayounsi@cumin1002: START - Cookbook sre.dns.wipe-cache sretest2005.codfw.wmnet on all recursors
13:54 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
13:54 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM sretest2005.codfw.wmnet - ayounsi@cumin1002"
13:53 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM sretest2005.codfw.wmnet - ayounsi@cumin1002"
13:47 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
13:47 ayounsi@cumin1002: START - Cookbook sre.ganeti.makevm for new host sretest2005.codfw.wmnet
13:45 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts srestest2005.codfw.wmnet
13:45 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
13:45 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: srestest2005.codfw.wmnet decommissioned, removing all IPs except the asset tag one - ayounsi@cumin1002"
13:44 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: srestest2005.codfw.wmnet decommissioned, removing all IPs except the asset tag one - ayounsi@cumin1002"
13:39 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
13:37 stevemunene@cumin1002: START - Cookbook sre.hadoop.init-hadoop-workers for hosts an-worker[1157-1175].eqiad.wmnet
13:36 ayounsi@cumin1002: START - Cookbook sre.hosts.decommission for hosts srestest2005.codfw.wmnet
13:34 ayounsi@cumin1002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=94) for new host srestest2005.codfw.wmnet
13:33 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM srestest2005.codfw.wmnet - ayounsi@cumin1002"
13:33 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM srestest2005.codfw.wmnet - ayounsi@cumin1002"
13:32 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) srestest2005.codfw.wmnet on all recursors
13:32 ayounsi@cumin1002: START - Cookbook sre.dns.wipe-cache srestest2005.codfw.wmnet on all recursors
13:32 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
13:32 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM srestest2005.codfw.wmnet - ayounsi@cumin1002"
13:31 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM srestest2005.codfw.wmnet - ayounsi@cumin1002"
13:26 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
13:26 ayounsi@cumin1002: START - Cookbook sre.ganeti.makevm for new host srestest2005.codfw.wmnet
13:16 ayounsi@cumin1002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=93) for new host srestest2005.codfw.wmnet
13:16 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) srestest2005.codfw.wmnet on all recursors
13:16 ayounsi@cumin1002: START - Cookbook sre.dns.wipe-cache srestest2005.codfw.wmnet on all recursors
13:16 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
13:16 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM srestest2005.codfw.wmnet - ayounsi@cumin1002"
13:15 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM srestest2005.codfw.wmnet - ayounsi@cumin1002"
13:12 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
13:12 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) srestest2005.codfw.wmnet on all recursors
13:12 ayounsi@cumin1002: START - Cookbook sre.dns.wipe-cache srestest2005.codfw.wmnet on all recursors
13:12 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
13:12 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM srestest2005.codfw.wmnet - ayounsi@cumin1002"
13:10 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM srestest2005.codfw.wmnet - ayounsi@cumin1002"
13:08 stevemunene@cumin1002: END (FAIL) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=99) for hosts an-worker[1159-1175].eqiad.wmnet
13:08 stevemunene@cumin1002: START - Cookbook sre.hadoop.init-hadoop-workers for hosts an-worker[1159-1175].eqiad.wmnet
13:08 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
13:08 ayounsi@cumin1002: START - Cookbook sre.ganeti.makevm for new host srestest2005.codfw.wmnet
13:06 stevemunene@cumin1002: END (FAIL) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=99) for hosts an-worker1158.eqiad.wmnet
13:04 stevemunene@cumin1002: START - Cookbook sre.hadoop.init-hadoop-workers for hosts an-worker1158.eqiad.wmnet
12:19 taavi: reprepro import exim4 4.96-15+deb12u4+wmf1 to component/exim4-arc T356171
11:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2114 (T343718)', diff saved to https://phabricator.wikimedia.org/P55896 and previous config saved to /var/cache/conftool/dbconfig/20240130-114726-ladsgroup.json
11:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-airflow1005.eqiad.wmnet
11:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2114', diff saved to https://phabricator.wikimedia.org/P55895 and previous config saved to /var/cache/conftool/dbconfig/20240130-113220-ladsgroup.json
11:30 stevemunene@cumin1002: END (FAIL) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=99) for hosts an-worker1157.eqiad.wmnet
11:28 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host an-airflow1005.eqiad.wmnet
11:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2113.codfw.wmnet with reason: Maintenance
11:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2113.codfw.wmnet with reason: Maintenance
11:19 stevemunene@cumin1002: START - Cookbook sre.hadoop.init-hadoop-workers for hosts an-worker1157.eqiad.wmnet
11:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2114', diff saved to https://phabricator.wikimedia.org/P55894 and previous config saved to /var/cache/conftool/dbconfig/20240130-111713-ladsgroup.json
11:12 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: analytics_cluster::airflow::search
11:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1183.eqiad.wmnet with reason: Maintenance
11:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1183.eqiad.wmnet with reason: Maintenance
11:02 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: analytics_cluster::airflow::search
11:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2114 (T343718)', diff saved to https://phabricator.wikimedia.org/P55893 and previous config saved to /var/cache/conftool/dbconfig/20240130-110207-ladsgroup.json
10:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2114 (T343718)', diff saved to https://phabricator.wikimedia.org/P55892 and previous config saved to /var/cache/conftool/dbconfig/20240130-105954-ladsgroup.json
10:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2114.codfw.wmnet with reason: Maintenance
10:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2114.codfw.wmnet with reason: Maintenance
10:56 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-airflow1005.eqiad.wmnet with OS bullseye
10:56 filippo@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
10:45 filippo@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
10:35 ayounsi@cumin1002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=93) for new host srestest2005.codfw.wmnet
10:35 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) srestest2005.codfw.wmnet on all recursors
10:35 ayounsi@cumin1002: START - Cookbook sre.dns.wipe-cache srestest2005.codfw.wmnet on all recursors
10:35 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
10:35 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM srestest2005.codfw.wmnet - ayounsi@cumin1002"
10:34 filippo@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
10:34 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM srestest2005.codfw.wmnet - ayounsi@cumin1002"
10:32 volans@cumin1002: END (FAIL) - Cookbook sre.netbox.update-extras (exit_code=1) rolling restart_daemons on A:netbox-canary
10:32 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-airflow1005.eqiad.wmnet with reason: host reimage
10:31 volans@cumin1002: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
10:31 volans@cumin1002: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
10:31 volans@cumin1002: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
10:29 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
10:29 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) srestest2005.codfw.wmnet on all recursors
10:29 ayounsi@cumin1002: START - Cookbook sre.dns.wipe-cache srestest2005.codfw.wmnet on all recursors
10:29 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
10:28 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM srestest2005.codfw.wmnet - ayounsi@cumin1002"
10:28 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM srestest2005.codfw.wmnet - ayounsi@cumin1002"
10:26 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on an-airflow1005.eqiad.wmnet with reason: host reimage
10:26 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
10:25 ayounsi@cumin1002: START - Cookbook sre.ganeti.makevm for new host srestest2005.codfw.wmnet
10:24 ayounsi@cumin1002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host srestest2005.codfw.wmnet
10:24 ayounsi@cumin1002: END (ERROR) - Cookbook sre.dns.netbox (exit_code=97)
10:23 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
10:23 ayounsi@cumin1002: START - Cookbook sre.ganeti.makevm for new host srestest2005.codfw.wmnet
10:23 filippo@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
10:16 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host an-airflow1005.eqiad.wmnet with OS bullseye
10:06 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host phab1004.eqiad.wmnet
10:00 gmodena@deploy2002: Finished deploy [airflow-dags/analytics@ccaa5dc]: (no justification provided) (duration: 00m 37s)
10:00 gmodena@deploy2002: Started deploy [airflow-dags/analytics@ccaa5dc]: (no justification provided)
09:56 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host phab1004.eqiad.wmnet
09:30 brouberol@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-tool1008.eqiad.wmnet with OS bullseye
09:14 brouberol@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-tool1008.eqiad.wmnet with reason: host reimage
09:11 brouberol@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on an-tool1008.eqiad.wmnet with reason: host reimage
09:07 marostegui@cumin1002: dbctl commit (dc=all): 'db2114 (re)pooling @ 100%: Switchover', diff saved to https://phabricator.wikimedia.org/P55891 and previous config saved to /var/cache/conftool/dbconfig/20240130-090704-root.json
09:00 brouberol@cumin1002: START - Cookbook sre.hosts.reimage for host an-tool1008.eqiad.wmnet with OS bullseye
08:57 Emperor: restart swift-object-replicator on ms-be1068
08:52 marostegui@cumin1002: dbctl commit (dc=all): 'db2114 (re)pooling @ 75%: Switchover', diff saved to https://phabricator.wikimedia.org/P55890 and previous config saved to /var/cache/conftool/dbconfig/20240130-085159-root.json
08:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2105 (re)pooling @ 100%: After switchover', diff saved to https://phabricator.wikimedia.org/P55889 and previous config saved to /var/cache/conftool/dbconfig/20240130-085055-root.json
08:38 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 100%: After switchover', diff saved to https://phabricator.wikimedia.org/P55888 and previous config saved to /var/cache/conftool/dbconfig/20240130-083829-root.json
08:36 marostegui@cumin1002: dbctl commit (dc=all): 'db2114 (re)pooling @ 50%: Switchover', diff saved to https://phabricator.wikimedia.org/P55887 and previous config saved to /var/cache/conftool/dbconfig/20240130-083654-root.json
08:35 marostegui@cumin1002: dbctl commit (dc=all): 'db2105 (re)pooling @ 75%: After switchover', diff saved to https://phabricator.wikimedia.org/P55886 and previous config saved to /var/cache/conftool/dbconfig/20240130-083550-root.json
08:29 moritzm: upgrading python-pymysql on remaining DB hosts to 1.0.2-2~wmf11u1 T355531
08:28 ladsgroup@deploy2002: Finished scap: Backport for Enable PageNotice extension on testwiki (T61245) (duration: 10m 24s)
08:23 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 75%: After switchover', diff saved to https://phabricator.wikimedia.org/P55885 and previous config saved to /var/cache/conftool/dbconfig/20240130-082324-root.json
08:22 ladsgroup@deploy2002: ladsgroup and tto: Continuing with sync
08:21 marostegui@cumin1002: dbctl commit (dc=all): 'db2114 (re)pooling @ 25%: Switchover', diff saved to https://phabricator.wikimedia.org/P55884 and previous config saved to /var/cache/conftool/dbconfig/20240130-082149-root.json
08:20 marostegui@cumin1002: dbctl commit (dc=all): 'db2105 (re)pooling @ 50%: After switchover', diff saved to https://phabricator.wikimedia.org/P55883 and previous config saved to /var/cache/conftool/dbconfig/20240130-082045-root.json
08:19 ladsgroup@deploy2002: ladsgroup and tto: Backport for Enable PageNotice extension on testwiki (T61245) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
08:18 ladsgroup@deploy2002: Started scap: Backport for Enable PageNotice extension on testwiki (T61245)
08:08 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 50%: After switchover', diff saved to https://phabricator.wikimedia.org/P55882 and previous config saved to /var/cache/conftool/dbconfig/20240130-080819-root.json
08:06 marostegui@cumin1002: dbctl commit (dc=all): 'db2114 (re)pooling @ 10%: Switchover', diff saved to https://phabricator.wikimedia.org/P55881 and previous config saved to /var/cache/conftool/dbconfig/20240130-080644-root.json
08:05 marostegui@cumin1002: dbctl commit (dc=all): 'db2105 (re)pooling @ 25%: After switchover', diff saved to https://phabricator.wikimedia.org/P55880 and previous config saved to /var/cache/conftool/dbconfig/20240130-080540-root.json
07:55 ayounsi@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti2034.codfw.wmnet to cluster codfw02 and group AB
07:53 ayounsi@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2034.codfw.wmnet to cluster codfw02 and group AB
07:53 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 25%: After switchover', diff saved to https://phabricator.wikimedia.org/P55879 and previous config saved to /var/cache/conftool/dbconfig/20240130-075314-root.json
07:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2105 (re)pooling @ 10%: After switchover', diff saved to https://phabricator.wikimedia.org/P55878 and previous config saved to /var/cache/conftool/dbconfig/20240130-075035-root.json
07:47 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2105 T356069', diff saved to https://phabricator.wikimedia.org/P55877 and previous config saved to /var/cache/conftool/dbconfig/20240130-074746-root.json
07:46 marostegui@cumin1002: dbctl commit (dc=all): 'Promote db2127 to s3 primary and set section read-write T356069', diff saved to https://phabricator.wikimedia.org/P55876 and previous config saved to /var/cache/conftool/dbconfig/20240130-074656-marostegui.json
07:46 marostegui@cumin1002: dbctl commit (dc=all): 'Set s3 codfw as read-only for maintenance - T356069', diff saved to https://phabricator.wikimedia.org/P55875 and previous config saved to /var/cache/conftool/dbconfig/20240130-074634-marostegui.json
07:46 marostegui: Starting s3 codfw failover from db2105 to db2127 - T356069
07:38 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 10%: After switchover', diff saved to https://phabricator.wikimedia.org/P55874 and previous config saved to /var/cache/conftool/dbconfig/20240130-073807-root.json
07:33 root@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 23 hosts with reason: Primary switchover s3 T356069
07:32 marostegui@cumin1002: dbctl commit (dc=all): 'Set db2127 with weight 0 T356069', diff saved to https://phabricator.wikimedia.org/P55873 and previous config saved to /var/cache/conftool/dbconfig/20240130-073257-marostegui.json
07:32 root@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 23 hosts with reason: Primary switchover s3 T356069
07:27 marostegui@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 100%: After switchover', diff saved to https://phabricator.wikimedia.org/P55872 and previous config saved to /var/cache/conftool/dbconfig/20240130-072734-root.json
07:23 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 5%: After switchover', diff saved to https://phabricator.wikimedia.org/P55871 and previous config saved to /var/cache/conftool/dbconfig/20240130-072302-root.json
07:16 marostegui@cumin1002: dbctl commit (dc=all): 'db2103 (re)pooling @ 100%: After switchover', diff saved to https://phabricator.wikimedia.org/P55870 and previous config saved to /var/cache/conftool/dbconfig/20240130-071612-root.json
07:12 marostegui@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 75%: After switchover', diff saved to https://phabricator.wikimedia.org/P55869 and previous config saved to /var/cache/conftool/dbconfig/20240130-071229-root.json
07:12 marostegui@cumin1002: dbctl commit (dc=all): 'Promote db2144 to x2 master T356060', diff saved to https://phabricator.wikimedia.org/P55868 and previous config saved to /var/cache/conftool/dbconfig/20240130-071202-root.json
07:07 marostegui@cumin1002: dbctl commit (dc=all): 'es2020 (re)pooling @ 1%: After switchover', diff saved to https://phabricator.wikimedia.org/P55867 and previous config saved to /var/cache/conftool/dbconfig/20240130-070757-root.json
07:02 marostegui@deploy2002: Finished scap: Backport for Revert "db-production.php: Disable writes on es4" (duration: 07m 48s)
07:01 marostegui@cumin1002: dbctl commit (dc=all): 'db2103 (re)pooling @ 75%: After switchover', diff saved to https://phabricator.wikimedia.org/P55866 and previous config saved to /var/cache/conftool/dbconfig/20240130-070107-root.json
07:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: Primary switchover x2 T356060
07:00 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 6 hosts with reason: Primary switchover x2 T356060
06:57 marostegui@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 50%: After switchover', diff saved to https://phabricator.wikimedia.org/P55865 and previous config saved to /var/cache/conftool/dbconfig/20240130-065724-root.json
06:55 marostegui@deploy2002: marostegui: Continuing with sync
06:55 marostegui@deploy2002: marostegui: Backport for Revert "db-production.php: Disable writes on es4" synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
06:54 marostegui@deploy2002: Started scap: Backport for Revert "db-production.php: Disable writes on es4"
06:48 marostegui@deploy2002: backport Cancelled
06:46 marostegui@cumin1002: dbctl commit (dc=all): 'db2103 (re)pooling @ 50%: After switchover', diff saved to https://phabricator.wikimedia.org/P55864 and previous config saved to /var/cache/conftool/dbconfig/20240130-064602-root.json
06:45 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es2020 T356064', diff saved to https://phabricator.wikimedia.org/P55863 and previous config saved to /var/cache/conftool/dbconfig/20240130-064526-root.json
06:45 marostegui@cumin1002: dbctl commit (dc=all): 'Reduce es2021 weight T356064', diff saved to https://phabricator.wikimedia.org/P55862 and previous config saved to /var/cache/conftool/dbconfig/20240130-064512-root.json
06:42 marostegui@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 25%: After switchover', diff saved to https://phabricator.wikimedia.org/P55861 and previous config saved to /var/cache/conftool/dbconfig/20240130-064219-root.json
06:36 marostegui@cumin1002: dbctl commit (dc=all): 'Promote es2021 to es4 primary T356064', diff saved to https://phabricator.wikimedia.org/P55860 and previous config saved to /var/cache/conftool/dbconfig/20240130-063625-root.json
06:35 marostegui: Starting es4 codfw failover from es2020 to es2021 - T356064
06:30 marostegui@cumin1002: dbctl commit (dc=all): 'db2103 (re)pooling @ 25%: After switchover', diff saved to https://phabricator.wikimedia.org/P55859 and previous config saved to /var/cache/conftool/dbconfig/20240130-063057-root.json
06:30 marostegui@deploy2002: Finished scap: Backport for db-production.php: Disable writes on es4 (T356064) (duration: 09m 11s)
06:29 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1224 T354591', diff saved to https://phabricator.wikimedia.org/P55858 and previous config saved to /var/cache/conftool/dbconfig/20240130-062930-root.json
06:27 marostegui@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 10%: After switchover', diff saved to https://phabricator.wikimedia.org/P55857 and previous config saved to /var/cache/conftool/dbconfig/20240130-062714-root.json
06:23 marostegui@deploy2002: marostegui: Continuing with sync
06:22 marostegui@deploy2002: marostegui: Backport for db-production.php: Disable writes on es4 (T356064) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
06:22 marostegui@cumin1002: dbctl commit (dc=all): 'Set es2020 with weight 0 T356064', diff saved to https://phabricator.wikimedia.org/P55856 and previous config saved to /var/cache/conftool/dbconfig/20240130-062241-marostegui.json
06:22 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: Primary switchover es4 T356064
06:22 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 6 hosts with reason: Primary switchover es4 T356064
06:21 marostegui@deploy2002: Started scap: Backport for db-production.php: Disable writes on es4 (T356064)
06:18 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: Primary switchover es4 T356064
06:18 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 6 hosts with reason: Primary switchover es4 T356064
06:15 marostegui@cumin1002: dbctl commit (dc=all): 'db2103 (re)pooling @ 10%: After switchover', diff saved to https://phabricator.wikimedia.org/P55855 and previous config saved to /var/cache/conftool/dbconfig/20240130-061552-root.json
06:15 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2103 T356059', diff saved to https://phabricator.wikimedia.org/P55854 and previous config saved to /var/cache/conftool/dbconfig/20240130-061529-root.json
06:14 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2146', diff saved to https://phabricator.wikimedia.org/P55853 and previous config saved to /var/cache/conftool/dbconfig/20240130-061423-root.json
06:13 marostegui@cumin1002: dbctl commit (dc=all): 'Promote db2112 to s1 primary and set section read-write T356059', diff saved to https://phabricator.wikimedia.org/P55852 and previous config saved to /var/cache/conftool/dbconfig/20240130-061305-marostegui.json
06:12 marostegui@cumin1002: dbctl commit (dc=all): 'Set s1 codfw as read-only for maintenance - T356059', diff saved to https://phabricator.wikimedia.org/P55851 and previous config saved to /var/cache/conftool/dbconfig/20240130-061243-marostegui.json
06:12 marostegui: Starting s1 codfw failover from db2103 to db2112 - T356059
06:10 marostegui@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 10%: After switchover', diff saved to https://phabricator.wikimedia.org/P55850 and previous config saved to /var/cache/conftool/dbconfig/20240130-061014-root.json
06:07 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2146', diff saved to https://phabricator.wikimedia.org/P55849 and previous config saved to /var/cache/conftool/dbconfig/20240130-060727-root.json
05:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 36 hosts with reason: Primary switchover s1 T356059
05:44 marostegui@cumin1002: dbctl commit (dc=all): 'Set db2112 with weight 0 T356059', diff saved to https://phabricator.wikimedia.org/P55848 and previous config saved to /var/cache/conftool/dbconfig/20240130-054410-marostegui.json
05:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 36 hosts with reason: Primary switchover s1 T356059
05:41 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2114 T355739', diff saved to https://phabricator.wikimedia.org/P55847 and previous config saved to /var/cache/conftool/dbconfig/20240130-054154-root.json
05:40 marostegui@cumin1002: dbctl commit (dc=all): 'Set s6 codfw as read-only for maintenance - T355739', diff saved to https://phabricator.wikimedia.org/P55845 and previous config saved to /var/cache/conftool/dbconfig/20240130-054025-root.json
05:40 marostegui: Starting s6 codfw failover from db2114 to db2129 - T355739
05:19 marostegui@cumin1002: dbctl commit (dc=all): 'Set db2129 with weight 0 T355739', diff saved to https://phabricator.wikimedia.org/P55844 and previous config saved to /var/cache/conftool/dbconfig/20240130-051952-marostegui.json
05:19 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 28 hosts with reason: Primary switchover s6 T355739
05:19 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 28 hosts with reason: Primary switchover s6 T355739
04:57 mwpresync@deploy2002: Finished scap: testwikis wikis to 1.42.0-wmf.16 refs T354434 (duration: 52m 38s)
04:04 mwpresync@deploy2002: Started scap: testwikis wikis to 1.42.0-wmf.16 refs T354434
04:02 mwpresync@deploy2002: Pruned MediaWiki: 1.42.0-wmf.13 (duration: 02m 09s)
03:30 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
03:29 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
00:00 eileen: tools upgraded from 117e1f9c to 544301bd

2024-01-29

22:31 catrope@deploy2002: Finished scap: Backport for Drop English Wikipedia configuration for wgMFUseDesktopSpecialHistoryPage (T353388) (duration: 28m 33s)
22:24 catrope@deploy2002: catrope and jdlrobson: Continuing with sync
22:03 catrope@deploy2002: catrope and jdlrobson: Backport for Drop English Wikipedia configuration for wgMFUseDesktopSpecialHistoryPage (T353388) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
22:02 catrope@deploy2002: Started scap: Backport for Drop English Wikipedia configuration for wgMFUseDesktopSpecialHistoryPage (T353388)
21:54 catrope@deploy2002: Finished scap: Backport for Use desktop history page HTML everywhere (T353388), Begin capturing errors for Wikivoyage (duration: 12m 05s)
21:48 catrope@deploy2002: catrope and jdlrobson: Continuing with sync
21:43 catrope@deploy2002: catrope and jdlrobson: Backport for Use desktop history page HTML everywhere (T353388), Begin capturing errors for Wikivoyage synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
21:42 catrope@deploy2002: Started scap: Backport for Use desktop history page HTML everywhere (T353388), Begin capturing errors for Wikivoyage
21:36 catrope@deploy2002: Finished scap: Backport for DiscussionTools: Enable permalinks frontend everywhere except en.wiki (T356063) (duration: 12m 19s)
21:30 catrope@deploy2002: catrope and esanders: Continuing with sync
21:25 catrope@deploy2002: catrope and esanders: Backport for DiscussionTools: Enable permalinks frontend everywhere except en.wiki (T356063) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
21:24 catrope@deploy2002: Started scap: Backport for DiscussionTools: Enable permalinks frontend everywhere except en.wiki (T356063)
21:17 catrope@deploy2002: Finished scap: Backport for cirrus: Disable cloudelastic writes to testwiki and mw.org (T352335) (duration: 08m 40s)
21:11 catrope@deploy2002: ebernhardson and catrope: Continuing with sync
21:10 catrope@deploy2002: ebernhardson and catrope: Backport for cirrus: Disable cloudelastic writes to testwiki and mw.org (T352335) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
21:09 catrope@deploy2002: Started scap: Backport for cirrus: Disable cloudelastic writes to testwiki and mw.org (T352335)
20:37 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
20:37 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
20:33 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
20:33 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
20:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193 (T355609)', diff saved to https://phabricator.wikimedia.org/P55843 and previous config saved to /var/cache/conftool/dbconfig/20240129-202740-marostegui.json
20:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P55842 and previous config saved to /var/cache/conftool/dbconfig/20240129-201233-marostegui.json
19:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P55841 and previous config saved to /var/cache/conftool/dbconfig/20240129-195725-marostegui.json
19:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193 (T355609)', diff saved to https://phabricator.wikimedia.org/P55840 and previous config saved to /var/cache/conftool/dbconfig/20240129-194218-marostegui.json
19:36 zabe@deploy2002: Finished scap: Backport for Start reading from af_actor/afh_actor everywhere (T355616) (duration: 09m 09s)
19:33 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2193 (T355609)', diff saved to https://phabricator.wikimedia.org/P55839 and previous config saved to /var/cache/conftool/dbconfig/20240129-193317-marostegui.json
19:33 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2193.codfw.wmnet with reason: Maintenance
19:33 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2193.codfw.wmnet with reason: Maintenance
19:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T355609)', diff saved to https://phabricator.wikimedia.org/P55838 and previous config saved to /var/cache/conftool/dbconfig/20240129-193254-marostegui.json
19:29 zabe@deploy2002: zabe: Continuing with sync
19:28 zabe@deploy2002: zabe: Backport for Start reading from af_actor/afh_actor everywhere (T355616) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
19:27 zabe@deploy2002: Started scap: Backport for Start reading from af_actor/afh_actor everywhere (T355616)
19:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P55837 and previous config saved to /var/cache/conftool/dbconfig/20240129-191748-marostegui.json
19:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P55836 and previous config saved to /var/cache/conftool/dbconfig/20240129-190241-marostegui.json
19:01 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
19:01 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
19:00 ayounsi@cumin1002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin[1001-1002].eqiad.wmnet with reason: CR993089 - ayounsi@cumin1002
18:59 ebernhardson@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
18:59 ebernhardson@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
18:58 ayounsi@cumin1002: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin[1001-1002].eqiad.wmnet with reason: CR993089 - ayounsi@cumin1002
18:49 brouberol@cumin1001: END (FAIL) - Cookbook sre.hadoop.roll-restart-masters (exit_code=99) restart masters for Hadoop test cluster: Restart of jvm daemons.
18:49 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
18:49 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
18:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T355609)', diff saved to https://phabricator.wikimedia.org/P55835 and previous config saved to /var/cache/conftool/dbconfig/20240129-184735-marostegui.json
18:29 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2180 (T355609)', diff saved to https://phabricator.wikimedia.org/P55834 and previous config saved to /var/cache/conftool/dbconfig/20240129-182909-marostegui.json
18:29 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2180.codfw.wmnet with reason: Maintenance
18:28 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2180.codfw.wmnet with reason: Maintenance
18:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316 (T355609)', diff saved to https://phabricator.wikimedia.org/P55833 and previous config saved to /var/cache/conftool/dbconfig/20240129-182846-marostegui.json
18:24 isaranto@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
18:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316', diff saved to https://phabricator.wikimedia.org/P55832 and previous config saved to /var/cache/conftool/dbconfig/20240129-181340-marostegui.json
17:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316', diff saved to https://phabricator.wikimedia.org/P55831 and previous config saved to /var/cache/conftool/dbconfig/20240129-175833-marostegui.json
17:43 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
17:43 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
17:43 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
17:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316 (T355609)', diff saved to https://phabricator.wikimedia.org/P55830 and previous config saved to /var/cache/conftool/dbconfig/20240129-174327-marostegui.json
17:43 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
17:42 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
17:42 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
17:34 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2171:3316 (T355609)', diff saved to https://phabricator.wikimedia.org/P55829 and previous config saved to /var/cache/conftool/dbconfig/20240129-173435-marostegui.json
17:34 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2171.codfw.wmnet with reason: Maintenance
17:34 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2171.codfw.wmnet with reason: Maintenance
17:34 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2169.codfw.wmnet with reason: Maintenance
17:34 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2169.codfw.wmnet with reason: Maintenance
17:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T355609)', diff saved to https://phabricator.wikimedia.org/P55828 and previous config saved to /var/cache/conftool/dbconfig/20240129-173406-marostegui.json
17:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P55824 and previous config saved to /var/cache/conftool/dbconfig/20240129-171859-marostegui.json
17:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P55823 and previous config saved to /var/cache/conftool/dbconfig/20240129-170353-marostegui.json
16:51 jdrewniak@deploy2002: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 06m 37s)
16:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T355609)', diff saved to https://phabricator.wikimedia.org/P55822 and previous config saved to /var/cache/conftool/dbconfig/20240129-164846-marostegui.json
16:44 jdrewniak@deploy2002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 07m 04s)
16:40 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
16:40 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2158 (T355609)', diff saved to https://phabricator.wikimedia.org/P55821 and previous config saved to /var/cache/conftool/dbconfig/20240129-164005-marostegui.json
16:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2187.codfw.wmnet with reason: Maintenance
16:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2187.codfw.wmnet with reason: Maintenance
16:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2158.codfw.wmnet with reason: Maintenance
16:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2158.codfw.wmnet with reason: Maintenance
16:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T355609)', diff saved to https://phabricator.wikimedia.org/P55820 and previous config saved to /var/cache/conftool/dbconfig/20240129-163926-marostegui.json
16:36 volans: installed spicerack 8.3.0 on cumin1002, cumin1001
16:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P55819 and previous config saved to /var/cache/conftool/dbconfig/20240129-162420-marostegui.json
16:20 ladsgroup@deploy2002: Finished scap: Backport for Drop old virtual domain for url shortener (duration: 09m 24s)
16:14 ladsgroup@deploy2002: ladsgroup: Continuing with sync
16:12 ladsgroup@deploy2002: ladsgroup: Backport for Drop old virtual domain for url shortener synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
16:11 ladsgroup@deploy2002: Started scap: Backport for Drop old virtual domain for url shortener
16:10 urandom: decommissioning restbase2019/cassandra-{a,b,c} — T352469
16:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P55817 and previous config saved to /var/cache/conftool/dbconfig/20240129-160913-marostegui.json
16:08 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase2019.codfw.wmnet with reason: Decommissioning — T352469
16:07 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase2019.codfw.wmnet with reason: Decommissioning — T352469
15:58 brouberol@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-tool1009.eqiad.wmnet with OS buster
15:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T355609)', diff saved to https://phabricator.wikimedia.org/P55816 and previous config saved to /var/cache/conftool/dbconfig/20240129-155406-marostegui.json
15:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2151 (T355609)', diff saved to https://phabricator.wikimedia.org/P55815 and previous config saved to /var/cache/conftool/dbconfig/20240129-154444-marostegui.json
15:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2151.codfw.wmnet with reason: Maintenance
15:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2151.codfw.wmnet with reason: Maintenance
15:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2129 (T355609)', diff saved to https://phabricator.wikimedia.org/P55814 and previous config saved to /var/cache/conftool/dbconfig/20240129-154422-marostegui.json
15:34 brouberol@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-tool1009.eqiad.wmnet with reason: host reimage
15:31 brouberol@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on an-tool1009.eqiad.wmnet with reason: host reimage
15:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2129', diff saved to https://phabricator.wikimedia.org/P55811 and previous config saved to /var/cache/conftool/dbconfig/20240129-152915-marostegui.json
15:26 Dreamy_Jazz: Running MediaModeration scanning script using `mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --use-jobqueue --sleep 30 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-30-no-render-now.txt` on a tmux session.
15:24 isaranto@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
15:23 isaranto@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
15:21 Dreamy_Jazz: Running `foreachwikiindblist group1.dblist extensions/MediaModeration/maintenance/resendMatchEmails.php 20200405 --verbose`
15:19 Dreamy_Jazz: Running `foreachwikiindblist group2.dblist extensions/MediaModeration/maintenance/resendMatchEmails.php 20200405`
15:17 Dreamy_Jazz: Stopping mediamoderation scanning script
15:17 brouberol@cumin1002: START - Cookbook sre.hosts.reimage for host an-tool1009.eqiad.wmnet with OS buster
15:15 Dreamy_Jazz: afternoon UTC backport window done
15:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2129', diff saved to https://phabricator.wikimedia.org/P55810 and previous config saved to /var/cache/conftool/dbconfig/20240129-151409-marostegui.json
15:14 dreamyjazz@deploy2002: Finished scap: Backport for Make the email subject unique for positive match emails (T355752) (duration: 21m 21s)
15:13 ayounsi@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts sretest1005.eqiad.wmnet
15:13 ayounsi@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
15:13 ayounsi@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: sretest1005.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - ayounsi@cumin2002"
15:12 ayounsi@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: sretest1005.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - ayounsi@cumin2002"
15:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-airflow1006.eqiad.wmnet
15:04 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-gp1001.eqiad.wmnet
15:04 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
15:04 dreamyjazz@deploy2002: dreamyjazz: Backport for Make the email subject unique for positive match emails (T355752) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
15:00 ayounsi@cumin2002: START - Cookbook sre.dns.netbox
14:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2129 (T355609)', diff saved to https://phabricator.wikimedia.org/P55809 and previous config saved to /var/cache/conftool/dbconfig/20240129-145902-marostegui.json
14:58 hashar@deploy2002: Finished deploy [gerrit/gerrit@5594608]: wm-checks-api: direct link to build when only one failed - T355774 (duration: 00m 07s)
14:58 hashar@deploy2002: Started deploy [gerrit/gerrit@5594608]: wm-checks-api: direct link to build when only one failed - T355774
14:57 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc-gp1001.eqiad.wmnet
14:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2129 (T355609)', diff saved to https://phabricator.wikimedia.org/P55808 and previous config saved to /var/cache/conftool/dbconfig/20240129-145652-marostegui.json
14:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
14:56 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
14:56 ayounsi@cumin2002: START - Cookbook sre.hosts.decommission for hosts sretest1005.eqiad.wmnet
14:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124 (T355609)', diff saved to https://phabricator.wikimedia.org/P55807 and previous config saved to /var/cache/conftool/dbconfig/20240129-145630-marostegui.json
14:56 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2055.codfw.wmnet
14:54 Dreamy_Jazz: scap backport is also backporting 993499 for T355357
14:53 ayounsi@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=97) for new host sretest1005.eqiad.wmnet
14:53 ayounsi@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1005.eqiad.wmnet with OS bookworm
14:52 dreamyjazz@deploy2002: Started scap: Backport for Make the email subject unique for positive match emails (T355752)
14:52 ayounsi@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1005.eqiad.wmnet with OS bookworm
14:51 dreamyjazz@deploy2002: sync-world aborted: Backport for Make the email subject unique for positive match emails (T355752) (duration: 04m 13s)
14:51 ayounsi@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM sretest1005.eqiad.wmnet - ayounsi@cumin2002"
14:50 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2055.codfw.wmnet
14:50 ayounsi@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM sretest1005.eqiad.wmnet - ayounsi@cumin2002"
14:49 ayounsi@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) sretest1005.eqiad.wmnet on all recursors
14:49 ayounsi@cumin2002: START - Cookbook sre.dns.wipe-cache sretest1005.eqiad.wmnet on all recursors
14:49 ayounsi@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
14:49 ayounsi@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM sretest1005.eqiad.wmnet - ayounsi@cumin2002"
14:48 ayounsi@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM sretest1005.eqiad.wmnet - ayounsi@cumin2002"
14:47 dreamyjazz@deploy2002: Started scap: Backport for Make the email subject unique for positive match emails (T355752)
14:46 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for hewikinews: remove wgExtraGenderNamespaces and add wgNamespaceAliases (T349581) (duration: 12m 29s)
14:42 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host an-airflow1006.eqiad.wmnet
14:42 ayounsi@cumin2002: START - Cookbook sre.dns.netbox
14:42 ayounsi@cumin2002: START - Cookbook sre.ganeti.makevm for new host sretest1005.eqiad.wmnet
14:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124', diff saved to https://phabricator.wikimedia.org/P55806 and previous config saved to /var/cache/conftool/dbconfig/20240129-144124-marostegui.json
14:40 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: analytics_cluster::airflow::analytics_product
14:40 logmsgbot: lucaswerkmeister-wmde@deploy2002 anzx and lucaswerkmeister-wmde: Continuing with sync
14:37 brouberol@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host an-tool1009.eqiad.wmnet with OS bullseye
14:36 logmsgbot: lucaswerkmeister-wmde@deploy2002 anzx and lucaswerkmeister-wmde: Backport for hewikinews: remove wgExtraGenderNamespaces and add wgNamespaceAliases (T349581) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:34 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for hewikinews: remove wgExtraGenderNamespaces and add wgNamespaceAliases (T349581)
14:30 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: analytics_cluster::airflow::analytics_product
14:30 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for knwiki: add portal namespace and fix talkpagenames of draft and module namespace (T355662 T346583) (duration: 08m 58s)
14:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124', diff saved to https://phabricator.wikimedia.org/P55804 and previous config saved to /var/cache/conftool/dbconfig/20240129-142617-marostegui.json
14:23 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host ceph2001.codfw.wmnet with OS bullseye
14:23 logmsgbot: lucaswerkmeister-wmde@deploy2002 anzx and lucaswerkmeister-wmde: Continuing with sync
14:22 logmsgbot: lucaswerkmeister-wmde@deploy2002 anzx and lucaswerkmeister-wmde: Backport for knwiki: add portal namespace and fix talkpagenames of draft and module namespace (T355662 T346583) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:21 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
14:21 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for knwiki: add portal namespace and fix talkpagenames of draft and module namespace (T355662 T346583)
14:17 volans: upgraded spicerack to 8.3.0 on cumin2002
14:16 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for uzwiki: revert temporary logo for the 20th anniversary (T353723) (duration: 11m 01s)
14:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124 (T355609)', diff saved to https://phabricator.wikimedia.org/P55803 and previous config saved to /var/cache/conftool/dbconfig/20240129-141111-marostegui.json
14:10 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-airflow1006.eqiad.wmnet with OS bullseye
14:09 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and anzx: Continuing with sync
14:07 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and anzx: Backport for uzwiki: revert temporary logo for the 20th anniversary (T353723) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:05 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for uzwiki: revert temporary logo for the 20th anniversary (T353723)
14:02 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2124 (T355609)', diff saved to https://phabricator.wikimedia.org/P55802 and previous config saved to /var/cache/conftool/dbconfig/20240129-140205-marostegui.json
14:01 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2124.codfw.wmnet with reason: Maintenance
14:01 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2124.codfw.wmnet with reason: Maintenance
14:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2117 (T355609)', diff saved to https://phabricator.wikimedia.org/P55801 and previous config saved to /var/cache/conftool/dbconfig/20240129-140142-marostegui.json
13:54 volans: uploaded spicerack_8.3.0 to apt.wikimedia.org bullseye-wikimedia
13:48 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2355.codfw.wmnet with OS bullseye
13:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2117', diff saved to https://phabricator.wikimedia.org/P55799 and previous config saved to /var/cache/conftool/dbconfig/20240129-134636-marostegui.json
13:46 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2445.codfw.wmnet with OS bullseye
13:40 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2429.codfw.wmnet with OS bullseye
13:40 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-airflow1006.eqiad.wmnet with reason: host reimage
13:37 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2381.codfw.wmnet with OS bullseye
13:36 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on an-airflow1006.eqiad.wmnet with reason: host reimage
13:35 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2260.codfw.wmnet with OS bullseye
13:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2117', diff saved to https://phabricator.wikimedia.org/P55798 and previous config saved to /var/cache/conftool/dbconfig/20240129-133129-marostegui.json
13:29 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2355.codfw.wmnet with reason: host reimage
13:26 claime: Restarting ferm.service on k8s node kubernetes2055 - T354855
13:25 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2445.codfw.wmnet with reason: host reimage
13:23 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host an-airflow1006.eqiad.wmnet with OS bullseye
13:23 brouberol@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-tool1009.eqiad.wmnet with reason: host reimage
13:20 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2429.codfw.wmnet with reason: host reimage
13:18 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2381.codfw.wmnet with reason: host reimage
13:17 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2445.codfw.wmnet with reason: host reimage
13:16 brouberol@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on an-tool1009.eqiad.wmnet with reason: host reimage
13:16 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2429.codfw.wmnet with reason: host reimage
13:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2117 (T355609)', diff saved to https://phabricator.wikimedia.org/P55797 and previous config saved to /var/cache/conftool/dbconfig/20240129-131623-marostegui.json
13:15 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2260.codfw.wmnet with reason: host reimage
13:14 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2381.codfw.wmnet with reason: host reimage
13:13 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2355.codfw.wmnet with reason: host reimage
13:12 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2260.codfw.wmnet with reason: host reimage
13:07 brouberol@cumin1002: START - Cookbook sre.hosts.reimage for host an-tool1009.eqiad.wmnet with OS bullseye
13:07 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2117 (T355609)', diff saved to https://phabricator.wikimedia.org/P55796 and previous config saved to /var/cache/conftool/dbconfig/20240129-130724-marostegui.json
13:07 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2117.codfw.wmnet with reason: Maintenance
13:07 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2117.codfw.wmnet with reason: Maintenance
13:00 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2445.codfw.wmnet with OS bullseye
12:59 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2429.codfw.wmnet with OS bullseye
12:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2097.codfw.wmnet with reason: Maintenance
12:58 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2097.codfw.wmnet with reason: Maintenance
12:58 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2381.codfw.wmnet with OS bullseye
12:57 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
12:57 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
12:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231 (T355609)', diff saved to https://phabricator.wikimedia.org/P55795 and previous config saved to /var/cache/conftool/dbconfig/20240129-125726-marostegui.json
12:57 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2355.codfw.wmnet with OS bullseye
12:56 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2260.codfw.wmnet with OS bullseye
12:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P55794 and previous config saved to /var/cache/conftool/dbconfig/20240129-124220-marostegui.json
12:33 moritzm: installing openssh security updates
12:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P55793 and previous config saved to /var/cache/conftool/dbconfig/20240129-122713-marostegui.json
12:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-airflow1007.eqiad.wmnet
12:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host an-airflow1007.eqiad.wmnet
12:14 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: analytics_cluster::airflow::wmde
12:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231 (T355609)', diff saved to https://phabricator.wikimedia.org/P55792 and previous config saved to /var/cache/conftool/dbconfig/20240129-121205-marostegui.json
12:06 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1231 (T355609)', diff saved to https://phabricator.wikimedia.org/P55791 and previous config saved to /var/cache/conftool/dbconfig/20240129-120628-marostegui.json
12:06 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1231.eqiad.wmnet with reason: Maintenance
12:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1231.eqiad.wmnet with reason: Maintenance
12:00 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: analytics_cluster::airflow::wmde
12:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1225.eqiad.wmnet with reason: Maintenance
11:59 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1225.eqiad.wmnet with reason: Maintenance
11:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224 (T355609)', diff saved to https://phabricator.wikimedia.org/P55790 and previous config saved to /var/cache/conftool/dbconfig/20240129-115953-marostegui.json
11:53 Dreamy_Jazz: Running mwscript maintenance/sql.php --wiki=testwiki --wikidb=centralauth ~/T354700-create-table-global.sql for T354700
11:45 Dreamy_Jazz: sql.php finished for T354700
11:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P55789 and previous config saved to /var/cache/conftool/dbconfig/20240129-114446-marostegui.json
11:41 Dreamy_Jazz: T354700 - Running `foreachwiki maintenance/sql.php ~/T354700-create-table.sql`
11:39 Dreamy_Jazz: T354700 - Ran mwscript maintenance/sql.php --wiki=testwiki ~/T354700-create-table.sql
11:38 moritzm: upload ganeti 3.0.2-3+wmf1 (bookworm package of Ganeti plus backport for SSL chain handling in RAPI) to apt.wikimedia.org T300152
11:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P55788 and previous config saved to /var/cache/conftool/dbconfig/20240129-112940-marostegui.json
11:28 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-airflow1007.eqiad.wmnet with OS bullseye
11:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224 (T355609)', diff saved to https://phabricator.wikimedia.org/P55787 and previous config saved to /var/cache/conftool/dbconfig/20240129-111434-marostegui.json
11:09 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1224 (T355609)', diff saved to https://phabricator.wikimedia.org/P55786 and previous config saved to /var/cache/conftool/dbconfig/20240129-110955-marostegui.json
11:09 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1224.eqiad.wmnet with reason: Maintenance
11:09 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1224.eqiad.wmnet with reason: Maintenance
11:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213:3316 (T355609)', diff saved to https://phabricator.wikimedia.org/P55785 and previous config saved to /var/cache/conftool/dbconfig/20240129-110933-marostegui.json
11:05 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-airflow1007.eqiad.wmnet with reason: host reimage
11:01 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on an-airflow1007.eqiad.wmnet with reason: host reimage
10:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213:3316', diff saved to https://phabricator.wikimedia.org/P55784 and previous config saved to /var/cache/conftool/dbconfig/20240129-105427-marostegui.json
10:53 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1054.eqiad.wmnet
10:53 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2054.codfw.wmnet
10:47 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2054.codfw.wmnet
10:47 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1054.eqiad.wmnet
10:47 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host an-airflow1007.eqiad.wmnet with OS bullseye
10:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213:3316', diff saved to https://phabricator.wikimedia.org/P55783 and previous config saved to /var/cache/conftool/dbconfig/20240129-103920-marostegui.json
10:38 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
10:37 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
10:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213:3316 (T355609)', diff saved to https://phabricator.wikimedia.org/P55782 and previous config saved to /var/cache/conftool/dbconfig/20240129-102414-marostegui.json
10:18 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1213:3316 (T355609)', diff saved to https://phabricator.wikimedia.org/P55781 and previous config saved to /var/cache/conftool/dbconfig/20240129-101757-marostegui.json
10:17 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1213.eqiad.wmnet with reason: Maintenance
10:17 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1213.eqiad.wmnet with reason: Maintenance
10:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201 (T355609)', diff saved to https://phabricator.wikimedia.org/P55780 and previous config saved to /var/cache/conftool/dbconfig/20240129-101735-marostegui.json
10:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201', diff saved to https://phabricator.wikimedia.org/P55779 and previous config saved to /var/cache/conftool/dbconfig/20240129-100229-marostegui.json
10:01 moritzm: upload prometheus-ganeti-exporter 0.3+deb12u1 to apt.wikimedia.org T300152
09:56 XioNoX: enable Puppet on all the ganeti servers for CR990968 deployment - T300152
09:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201', diff saved to https://phabricator.wikimedia.org/P55778 and previous config saved to /var/cache/conftool/dbconfig/20240129-094722-marostegui.json
09:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201 (T355609)', diff saved to https://phabricator.wikimedia.org/P55777 and previous config saved to /var/cache/conftool/dbconfig/20240129-093216-marostegui.json
09:27 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1201 (T355609)', diff saved to https://phabricator.wikimedia.org/P55776 and previous config saved to /var/cache/conftool/dbconfig/20240129-092724-marostegui.json
09:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1201.eqiad.wmnet with reason: Maintenance
09:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1201.eqiad.wmnet with reason: Maintenance
09:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187 (T355609)', diff saved to https://phabricator.wikimedia.org/P55775 and previous config saved to /var/cache/conftool/dbconfig/20240129-092702-marostegui.json
09:17 godog: mark for deletion and cleanup replicated thanos blocks for prometheus=ops, older than 3 months, all resolutions - T351927
09:13 moritzm: upgrading python-pymysql in S7 DB hosts to 1.0.2-2~wmf11u1 T355531
09:13 XioNoX: disable Puppet on all the ganeti servers for CR990968 deployment - T300152
09:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187', diff saved to https://phabricator.wikimedia.org/P55773 and previous config saved to /var/cache/conftool/dbconfig/20240129-091156-marostegui.json
08:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187', diff saved to https://phabricator.wikimedia.org/P55772 and previous config saved to /var/cache/conftool/dbconfig/20240129-085649-marostegui.json
08:46 marostegui@deploy2002: Finished scap: Backport for Revert "ProductionServices.php: Promote pc2014" (duration: 17m 13s)
08:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187 (T355609)', diff saved to https://phabricator.wikimedia.org/P55771 and previous config saved to /var/cache/conftool/dbconfig/20240129-084143-marostegui.json
08:39 marostegui@deploy2002: marostegui: Continuing with sync
08:39 marostegui@deploy2002: marostegui: Backport for Revert "ProductionServices.php: Promote pc2014" synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
08:36 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1187 (T355609)', diff saved to https://phabricator.wikimedia.org/P55770 and previous config saved to /var/cache/conftool/dbconfig/20240129-083627-marostegui.json
08:36 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1187.eqiad.wmnet with reason: Maintenance
08:36 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1187.eqiad.wmnet with reason: Maintenance
08:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T355609)', diff saved to https://phabricator.wikimedia.org/P55769 and previous config saved to /var/cache/conftool/dbconfig/20240129-083603-marostegui.json
08:29 marostegui@deploy2002: Started scap: Backport for Revert "ProductionServices.php: Promote pc2014"
08:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P55768 and previous config saved to /var/cache/conftool/dbconfig/20240129-082057-marostegui.json
08:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P55767 and previous config saved to /var/cache/conftool/dbconfig/20240129-080550-marostegui.json
07:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T355609)', diff saved to https://phabricator.wikimedia.org/P55766 and previous config saved to /var/cache/conftool/dbconfig/20240129-075044-marostegui.json
07:45 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1180 (T355609)', diff saved to https://phabricator.wikimedia.org/P55765 and previous config saved to /var/cache/conftool/dbconfig/20240129-074541-marostegui.json
07:45 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
07:45 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
07:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T355609)', diff saved to https://phabricator.wikimedia.org/P55764 and previous config saved to /var/cache/conftool/dbconfig/20240129-074519-marostegui.json
07:38 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 100%: After schema change', diff saved to https://phabricator.wikimedia.org/P55763 and previous config saved to /var/cache/conftool/dbconfig/20240129-073857-root.json
07:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P55762 and previous config saved to /var/cache/conftool/dbconfig/20240129-073012-marostegui.json
07:23 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 75%: After schema change', diff saved to https://phabricator.wikimedia.org/P55761 and previous config saved to /var/cache/conftool/dbconfig/20240129-072352-root.json
07:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P55760 and previous config saved to /var/cache/conftool/dbconfig/20240129-071506-marostegui.json
07:08 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 50%: After schema change', diff saved to https://phabricator.wikimedia.org/P55758 and previous config saved to /var/cache/conftool/dbconfig/20240129-070847-root.json
07:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T355609)', diff saved to https://phabricator.wikimedia.org/P55757 and previous config saved to /var/cache/conftool/dbconfig/20240129-065959-marostegui.json
06:54 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1168 (T355609)', diff saved to https://phabricator.wikimedia.org/P55756 and previous config saved to /var/cache/conftool/dbconfig/20240129-065450-marostegui.json
06:54 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
06:54 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
06:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T355609)', diff saved to https://phabricator.wikimedia.org/P55755 and previous config saved to /var/cache/conftool/dbconfig/20240129-065427-marostegui.json
06:53 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 25%: After schema change', diff saved to https://phabricator.wikimedia.org/P55754 and previous config saved to /var/cache/conftool/dbconfig/20240129-065341-root.json
06:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P55752 and previous config saved to /var/cache/conftool/dbconfig/20240129-063920-marostegui.json
06:38 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 10%: After schema change', diff saved to https://phabricator.wikimedia.org/P55751 and previous config saved to /var/cache/conftool/dbconfig/20240129-063836-root.json
06:33 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2129', diff saved to https://phabricator.wikimedia.org/P55750 and previous config saved to /var/cache/conftool/dbconfig/20240129-063302-marostegui.json
06:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P55747 and previous config saved to /var/cache/conftool/dbconfig/20240129-062414-marostegui.json
06:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T355609)', diff saved to https://phabricator.wikimedia.org/P55746 and previous config saved to /var/cache/conftool/dbconfig/20240129-060907-marostegui.json
06:04 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1165 (T355609)', diff saved to https://phabricator.wikimedia.org/P55745 and previous config saved to /var/cache/conftool/dbconfig/20240129-060400-marostegui.json
06:03 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
06:03 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
06:03 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
06:03 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
05:57 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db1134.eqiad.wmnet
05:57 marostegui@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
05:57 marostegui@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1134.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1002"
05:56 marostegui@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1134.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1002"
05:54 marostegui@cumin1002: START - Cookbook sre.dns.netbox
05:49 marostegui@cumin1002: START - Cookbook sre.hosts.decommission for hosts db1134.eqiad.wmnet

2024-01-28

01:11 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase2016.codfw.wmnet with reason: Decommissioning — T352469
01:11 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase2016.codfw.wmnet with reason: Decommissioning — T352469
01:10 urandom: decommissioning restbase2016/cassandra-{a,b,c} — T352469

2024-01-26

22:07 bking@cumin2002: END (FAIL) - Cookbook sre.puppet.migrate-host (exit_code=99) for host cloudelastic1006.wikimedia.org
22:06 bking@cumin2002: START - Cookbook sre.puppet.migrate-host for host cloudelastic1006.wikimedia.org
22:05 bking@cumin2002: END (FAIL) - Cookbook sre.puppet.migrate-host (exit_code=99) for host cloudelastic1006.wikimedia.org
22:04 bking@cumin2002: START - Cookbook sre.puppet.migrate-host for host cloudelastic1006.wikimedia.org
19:02 ejegg: fundraising civicrm upgraded from 8c0dc1d2 to b953d667
18:27 mutante: cloudweb1003 - OATHAuth disabled for Triciaburmeister. (after video verification - T355958)
18:16 mutante: phab1004 - removing 2fa from TBurmeister (after video verification) T355958
17:57 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1010.eqiad.wmnet with OS bullseye
17:57 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - bking@cumin2002"
17:53 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - bking@cumin2002"
17:37 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
17:34 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
17:17 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1010.eqiad.wmnet with OS bullseye
17:12 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1010
17:11 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1010
17:09 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
17:09 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: sync cloudelastic1010 IPs - bking@cumin2002"
17:08 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: sync cloudelastic1010 IPs - bking@cumin2002"
17:04 bking@cumin2002: START - Cookbook sre.dns.netbox
16:34 bking@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudelastic1010.wikimedia.org
16:33 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
16:33 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudelastic1010.wikimedia.org decommissioned, removing all IPs except the asset tag one - bking@cumin2002"
16:33 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: activate new elastic config - bking@cumin2002 - T355617
16:32 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudelastic1010.wikimedia.org decommissioned, removing all IPs except the asset tag one - bking@cumin2002"
16:30 arnaudb@cumin1002: dbctl commit (dc=all): 'Cloning db2169 in db2194 for T343674', diff saved to https://phabricator.wikimedia.org/P55740 and previous config saved to /var/cache/conftool/dbconfig/20240126-163057-arnaudb.json
16:29 bking@cumin2002: START - Cookbook sre.dns.netbox
16:23 bking@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudelastic1010.wikimedia.org
16:15 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: activate new elastic config - bking@cumin2002 - T355617
15:01 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
15:00 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
14:47 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
14:46 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
14:37 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
14:37 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
14:36 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase2015.codfw.wmnet with reason: Decommissioning — T352469
14:35 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase2015.codfw.wmnet with reason: Decommissioning — T352469
14:34 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
14:34 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
14:33 urandom: decommissioning restbase2015/cassandra-{a,b,c} — T352469
14:27 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
14:27 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
14:24 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
14:24 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
14:08 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
14:08 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
13:18 eoghan@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Gitlab security upgrade
12:36 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
12:36 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: codfw routed cluster svc - ayounsi@cumin1002"
12:35 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: codfw routed cluster svc - ayounsi@cumin1002"
12:30 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
11:43 taavi: reprepro: copy helm-diff_3.1.3-2 from bullseye-wikimedia to bookworm-wikimedia
11:28 eoghan@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Gitlab security upgrade
10:52 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
10:51 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet
10:50 eoghan@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Gitlab security upgrade
10:44 eoghan@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Gitlab security upgrade
10:36 moritzm: prune obsolete nginx packages from eventschema hosts after migration to new library scheme T329529
10:25 arnaudb@cumin1002: dbctl commit (dc=all): 'Cloning db2169 in db2194 for T343674', diff saved to https://phabricator.wikimedia.org/P55737 and previous config saved to /var/cache/conftool/dbconfig/20240126-102550-arnaudb.json
08:01 moritzm: rebalance codfw/B following switch maintenance T355549
07:54 moritzm: failover ganeti master for codfw back to ganeti2022, switch maintenance is completed T355549
01:01 dzahn@cumin1002: END (FAIL) - Cookbook sre.gitlab.upgrade (exit_code=99) on GitLab host gitlab1004.wikimedia.org with reason: security release
00:07 dzahn@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: security release
00:00 dzahn@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: security release

2024-01-25

23:54 zabe: zabe@mwmaint2002:~$ mwscript namespaceDupes.php --wiki=wikimaniawiki --fix # T347622
23:54 zabe@deploy2002: Finished scap: Backport for Setup namespace for 2025, 2026, enable subpages for 2023-2026 (T347622) (duration: 08m 30s)
23:47 zabe@deploy2002: robertsky and zabe: Continuing with sync
23:47 zabe@deploy2002: robertsky and zabe: Backport for Setup namespace for 2025, 2026, enable subpages for 2023-2026 (T347622) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
23:45 zabe@deploy2002: Started scap: Backport for Setup namespace for 2025, 2026, enable subpages for 2023-2026 (T347622)
23:29 zabe: zabe@mwmaint2002:/tmp/uploads$ mwscript importImages.php --wiki=commonswiki --comment-ext=txt --user=Sturm . # T355485
23:17 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on cloudelastic1010.wikimedia.org with reason: migration canary T355617
23:17 bking@cumin2002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on cloudelastic1010.wikimedia.org with reason: migration canary T355617
22:54 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Banning hosts: cloudelastic1010.wikimedia.org for use cloudelastic1010 as migration canary - bking@cumin2002 - T355617
22:53 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Banning hosts: cloudelastic1010.wikimedia.org for use cloudelastic1010 as migration canary - bking@cumin2002 - T355617
22:53 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.ban (exit_code=99) Banning hosts: cloudelastic1010 for use cloudelastic1010 as migration canary - bking@cumin2002 - T355617
22:53 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Banning hosts: cloudelastic1010 for use cloudelastic1010 as migration canary - bking@cumin2002 - T355617
22:52 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.ban (exit_code=99) Banning hosts: cloudelastic1010 for use cloudelastic1010 as migration canary - bking@cumin2002 - T355617
22:52 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Banning hosts: cloudelastic1010 for use cloudelastic1010 as migration canary - bking@cumin2002 - T355617
22:40 ryankemper: T351354 Restarting `cloudelastic1006` (final restart for today)
22:34 ryankemper: T351354 Now restarting new masters to keep configs in sync; restarting `cloudelastic1009`
22:33 ryankemper: T351354 Now restarting new masters to keep configs in sync; restarting `cloudelastic1007`
22:26 ryankemper: T351354 Restarting `cloudelastic1002`
22:19 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
22:19 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
22:15 ryankemper: T351354 Restarting `cloudelastic1004` following puppet run
22:12 dzahn@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: security release
22:11 ryankemper: T351354 Merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/993038; restarting `cloudelastic1001` following puppet run
22:08 ryankemper: T351354 Downtimed `cloudelastic*`; shortly will restart `cloudelastic100[1,2,4]` one host at a time to make them no longer masters
22:08 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 10 hosts with reason: cloudelastic maintenance
22:07 ryankemper@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on 10 hosts with reason: cloudelastic maintenance
21:55 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
21:55 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
21:44 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
21:44 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
21:44 ebernhardson@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
21:44 ebernhardson@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
21:19 ebernhardson@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
21:19 ebernhardson@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
21:14 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
21:14 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
21:13 ebernhardson@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
21:13 ebernhardson@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
20:58 ebernhardson@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
20:58 ebernhardson@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
20:57 ebernhardson@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
20:57 ebernhardson@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
20:56 ebernhardson@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
20:56 ebernhardson@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
20:55 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudrabbit1002.eqiad.wmnet with OS bookworm
20:55 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - taavi@cumin1002"
20:54 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - taavi@cumin1002"
20:51 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudrabbit1001.eqiad.wmnet with OS bookworm
20:51 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - taavi@cumin1002"
20:50 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - taavi@cumin1002"
20:37 ebernhardson@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
20:37 ebernhardson@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
20:36 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudrabbit1002.eqiad.wmnet with reason: host reimage
20:35 ebernhardson@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
20:35 ebernhardson@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
20:33 ebernhardson@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
20:33 ebernhardson@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
20:33 taavi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudrabbit1002.eqiad.wmnet with reason: host reimage
20:32 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudrabbit1001.eqiad.wmnet with reason: host reimage
20:27 taavi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudrabbit1001.eqiad.wmnet with reason: host reimage
20:26 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "set cloudrabbit1001/2 as active - taavi@cumin1002"
20:25 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "set cloudrabbit1001/2 as active - taavi@cumin1002"
20:19 taavi@cumin1002: START - Cookbook sre.hosts.reimage for host cloudrabbit1002.eqiad.wmnet with OS bookworm
20:19 taavi@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudrabbit1002.eqiad.wmnet with OS bookworm
20:16 zabe@deploy2002: Finished scap: Backport for Start reading from af_actor/afh_actor in group1 wikis (T355616) (duration: 11m 27s)
20:15 ebernhardson@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
20:15 ebernhardson@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
20:11 taavi@cumin1002: START - Cookbook sre.hosts.reimage for host cloudrabbit1002.eqiad.wmnet with OS bookworm
20:10 taavi@cumin1002: START - Cookbook sre.hosts.reimage for host cloudrabbit1001.eqiad.wmnet with OS bookworm
20:10 zabe@deploy2002: zabe: Continuing with sync
20:09 zabe@deploy2002: zabe: Backport for Start reading from af_actor/afh_actor in group1 wikis (T355616) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
20:06 taavi@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudrabbit1001
20:05 zabe@deploy2002: Started scap: Backport for Start reading from af_actor/afh_actor in group1 wikis (T355616)
20:05 taavi@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host cloudrabbit1001
20:05 taavi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
20:05 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add IPs for cloudrabbit1001 - taavi@cumin1002"
20:04 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add IPs for cloudrabbit1001 - taavi@cumin1002"
20:02 taavi@cumin1002: START - Cookbook sre.dns.netbox
20:01 taavi@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudrabbit1002
20:00 taavi@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host cloudrabbit1002
19:59 taavi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
19:59 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add IPs for cloudrabbit1002 - taavi@cumin1002"
19:58 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add IPs for cloudrabbit1002 - taavi@cumin1002"
19:56 taavi@cumin1002: START - Cookbook sre.dns.netbox
19:29 bking@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
19:29 bking@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
19:28 ebernhardson@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
19:28 ebernhardson@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
19:25 bking@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
19:24 bking@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
18:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2188 (re)pooling @ 100%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55736 and previous config saved to /var/cache/conftool/dbconfig/20240125-184922-root.json
18:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2178 (re)pooling @ 100%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55735 and previous config saved to /var/cache/conftool/dbconfig/20240125-184917-root.json
18:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2177 (re)pooling @ 100%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55734 and previous config saved to /var/cache/conftool/dbconfig/20240125-184911-root.json
18:49 marostegui@cumin1002: dbctl commit (dc=all): 'db2147 (re)pooling @ 100%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55733 and previous config saved to /var/cache/conftool/dbconfig/20240125-184906-root.json
18:49 marostegui@cumin1002: dbctl commit (dc=all): 'db2137:3315 (re)pooling @ 100%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55732 and previous config saved to /var/cache/conftool/dbconfig/20240125-184900-root.json
18:49 marostegui@cumin1002: dbctl commit (dc=all): 'db2137:3314 (re)pooling @ 100%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55731 and previous config saved to /var/cache/conftool/dbconfig/20240125-184853-root.json
18:48 marostegui@cumin1002: dbctl commit (dc=all): 'db2107 (re)pooling @ 100%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55730 and previous config saved to /var/cache/conftool/dbconfig/20240125-184845-root.json
18:48 marostegui@cumin1002: dbctl commit (dc=all): 'db2109 (re)pooling @ 100%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55729 and previous config saved to /var/cache/conftool/dbconfig/20240125-184839-root.json
18:48 marostegui@cumin1002: dbctl commit (dc=all): 'db2159 (re)pooling @ 100%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55728 and previous config saved to /var/cache/conftool/dbconfig/20240125-184823-root.json
18:47 mutante: phab2002 - rebooting
18:46 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: reboot
18:45 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on phab2002.codfw.wmnet with reason: reboot
18:35 marostegui@cumin1002: dbctl commit (dc=all): 'db2188 (re)pooling @ 75%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55727 and previous config saved to /var/cache/conftool/dbconfig/20240125-183417-root.json
18:34 marostegui@cumin1002: dbctl commit (dc=all): 'db2178 (re)pooling @ 75%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55726 and previous config saved to /var/cache/conftool/dbconfig/20240125-183412-root.json
18:34 marostegui@cumin1002: dbctl commit (dc=all): 'db2177 (re)pooling @ 75%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55725 and previous config saved to /var/cache/conftool/dbconfig/20240125-183406-root.json
18:34 marostegui@cumin1002: dbctl commit (dc=all): 'db2147 (re)pooling @ 75%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55724 and previous config saved to /var/cache/conftool/dbconfig/20240125-183401-root.json
18:34 marostegui@cumin1002: dbctl commit (dc=all): 'db2137:3315 (re)pooling @ 75%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55723 and previous config saved to /var/cache/conftool/dbconfig/20240125-183355-root.json
18:34 marostegui@cumin1002: dbctl commit (dc=all): 'db2137:3314 (re)pooling @ 75%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55722 and previous config saved to /var/cache/conftool/dbconfig/20240125-183348-root.json
18:33 marostegui@cumin1002: dbctl commit (dc=all): 'db2107 (re)pooling @ 75%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55721 and previous config saved to /var/cache/conftool/dbconfig/20240125-183340-root.json
18:33 marostegui@cumin1002: dbctl commit (dc=all): 'db2109 (re)pooling @ 75%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55720 and previous config saved to /var/cache/conftool/dbconfig/20240125-183334-root.json
18:33 marostegui@cumin1002: dbctl commit (dc=all): 'db2159 (re)pooling @ 75%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55719 and previous config saved to /var/cache/conftool/dbconfig/20240125-183318-root.json
18:20 marostegui@cumin1002: dbctl commit (dc=all): 'db2188 (re)pooling @ 50%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55718 and previous config saved to /var/cache/conftool/dbconfig/20240125-181912-root.json
18:19 marostegui@cumin1002: dbctl commit (dc=all): 'db2178 (re)pooling @ 50%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55717 and previous config saved to /var/cache/conftool/dbconfig/20240125-181907-root.json
18:19 marostegui@cumin1002: dbctl commit (dc=all): 'db2177 (re)pooling @ 50%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55716 and previous config saved to /var/cache/conftool/dbconfig/20240125-181901-root.json
18:19 marostegui@cumin1002: dbctl commit (dc=all): 'db2147 (re)pooling @ 50%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55715 and previous config saved to /var/cache/conftool/dbconfig/20240125-181856-root.json
18:19 marostegui@cumin1002: dbctl commit (dc=all): 'db2137:3315 (re)pooling @ 50%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55714 and previous config saved to /var/cache/conftool/dbconfig/20240125-181850-root.json
18:18 marostegui@cumin1002: dbctl commit (dc=all): 'db2137:3314 (re)pooling @ 50%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55713 and previous config saved to /var/cache/conftool/dbconfig/20240125-181843-root.json
18:18 marostegui@cumin1002: dbctl commit (dc=all): 'db2107 (re)pooling @ 50%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55712 and previous config saved to /var/cache/conftool/dbconfig/20240125-181835-root.json
18:18 marostegui@cumin1002: dbctl commit (dc=all): 'db2109 (re)pooling @ 50%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55711 and previous config saved to /var/cache/conftool/dbconfig/20240125-181829-root.json
18:18 marostegui@cumin1002: dbctl commit (dc=all): 'db2159 (re)pooling @ 50%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55710 and previous config saved to /var/cache/conftool/dbconfig/20240125-181814-root.json
18:13 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host durum6001.drmrs.wmnet with OS bookworm
18:06 marostegui@cumin1002: dbctl commit (dc=all): 'db2188 (re)pooling @ 25%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55709 and previous config saved to /var/cache/conftool/dbconfig/20240125-180407-root.json
18:06 marostegui@cumin1002: dbctl commit (dc=all): 'db2178 (re)pooling @ 25%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55708 and previous config saved to /var/cache/conftool/dbconfig/20240125-180402-root.json
18:05 marostegui@cumin1002: dbctl commit (dc=all): 'db2177 (re)pooling @ 25%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55707 and previous config saved to /var/cache/conftool/dbconfig/20240125-180356-root.json
18:05 marostegui@cumin1002: dbctl commit (dc=all): 'db2147 (re)pooling @ 25%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55706 and previous config saved to /var/cache/conftool/dbconfig/20240125-180351-root.json
18:04 marostegui@cumin1002: dbctl commit (dc=all): 'db2137:3315 (re)pooling @ 25%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55705 and previous config saved to /var/cache/conftool/dbconfig/20240125-180345-root.json
18:04 marostegui@cumin1002: dbctl commit (dc=all): 'db2137:3314 (re)pooling @ 25%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55704 and previous config saved to /var/cache/conftool/dbconfig/20240125-180338-root.json
18:03 marostegui@cumin1002: dbctl commit (dc=all): 'db2107 (re)pooling @ 25%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55703 and previous config saved to /var/cache/conftool/dbconfig/20240125-180330-root.json
18:03 marostegui@cumin1002: dbctl commit (dc=all): 'db2109 (re)pooling @ 25%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55702 and previous config saved to /var/cache/conftool/dbconfig/20240125-180324-root.json
18:03 marostegui@cumin1002: dbctl commit (dc=all): 'db2159 (re)pooling @ 25%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55701 and previous config saved to /var/cache/conftool/dbconfig/20240125-180308-root.json
18:01 sukhe: running authdns-update for CR 993008: T355835
17:51 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2140.codfw.wmnet with reason: Maintenance
17:51 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2140.codfw.wmnet with reason: Maintenance
17:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2188 (re)pooling @ 10%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55700 and previous config saved to /var/cache/conftool/dbconfig/20240125-174902-root.json
17:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2178 (re)pooling @ 10%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55699 and previous config saved to /var/cache/conftool/dbconfig/20240125-174857-root.json
17:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2177 (re)pooling @ 10%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55698 and previous config saved to /var/cache/conftool/dbconfig/20240125-174851-root.json
17:49 marostegui@cumin1002: dbctl commit (dc=all): 'db2147 (re)pooling @ 10%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55697 and previous config saved to /var/cache/conftool/dbconfig/20240125-174846-root.json
17:49 marostegui@cumin1002: dbctl commit (dc=all): 'db2137:3315 (re)pooling @ 10%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55696 and previous config saved to /var/cache/conftool/dbconfig/20240125-174840-root.json
17:49 marostegui@cumin1002: dbctl commit (dc=all): 'db2137:3314 (re)pooling @ 10%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55695 and previous config saved to /var/cache/conftool/dbconfig/20240125-174833-root.json
17:48 marostegui@cumin1002: dbctl commit (dc=all): 'db2107 (re)pooling @ 10%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55694 and previous config saved to /var/cache/conftool/dbconfig/20240125-174825-root.json
17:48 marostegui@cumin1002: dbctl commit (dc=all): 'db2109 (re)pooling @ 10%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55693 and previous config saved to /var/cache/conftool/dbconfig/20240125-174819-root.json
17:48 marostegui@cumin1002: dbctl commit (dc=all): 'db2159 (re)pooling @ 10%: After network maintenance', diff saved to https://phabricator.wikimedia.org/P55692 and previous config saved to /var/cache/conftool/dbconfig/20240125-174803-root.json
17:47 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on durum6001.drmrs.wmnet with reason: host reimage
17:45 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for asw-b-codfw,lsw1-b5-codfw.mgmt
17:45 cmooney@cumin1002: START - Cookbook sre.hosts.remove-downtime for asw-b-codfw,lsw1-b5-codfw.mgmt
17:43 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on durum6001.drmrs.wmnet with reason: host reimage
17:38 btullis@deploy2002: helmfile [eqiad] DONE helmfile.d/services/datahub: sync on main
17:34 btullis@deploy2002: helmfile [eqiad] START helmfile.d/services/datahub: apply on main
17:33 btullis@deploy2002: helmfile [codfw] DONE helmfile.d/services/datahub: sync on main
17:30 Amir1: deploying new captchas (T141490)
17:22 btullis@deploy2002: helmfile [codfw] START helmfile.d/services/datahub: apply on main
17:22 btullis@deploy2002: helmfile [staging] DONE helmfile.d/services/datahub: sync on main
17:21 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host durum6001.drmrs.wmnet with OS bookworm
17:17 btullis@deploy2002: helmfile [staging] START helmfile.d/services/datahub: apply on main
17:09 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
17:09 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
17:07 taavi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
17:05 taavi@cumin1002: START - Cookbook sre.dns.netbox
17:04 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudrabbit[1001-1002].wikimedia.org
17:04 taavi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
17:04 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudrabbit[1001-1002].wikimedia.org decommissioned, removing all IPs except the asset tag one - taavi@cumin1002"
17:01 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: apply
17:01 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: apply
17:00 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudrabbit[1001-1002].wikimedia.org decommissioned, removing all IPs except the asset tag one - taavi@cumin1002"
16:56 taavi@cumin1002: START - Cookbook sre.dns.netbox
16:52 sukhe: running authdns-update for CR 992936: T355835
16:49 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase2014.codfw.wmnet with reason: Decommissioning — T352469
16:49 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase2014.codfw.wmnet with reason: Decommissioning — T352469
16:48 taavi@cumin1002: START - Cookbook sre.hosts.decommission for hosts cloudrabbit[1001-1002].wikimedia.org
16:48 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase2013.codfw.wmnet with reason: Decommissioning — T352469
16:48 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase2013.codfw.wmnet with reason: Decommissioning — T352469
16:43 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 32 hosts
16:42 cmooney@cumin1002: START - Cookbook sre.hosts.remove-downtime for 32 hosts
16:42 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cr[1-2]-codfw
16:41 cmooney@cumin1002: START - Cookbook sre.hosts.remove-downtime for cr[1-2]-codfw
16:34 cgoubert@cumin2002: conftool action : set/pooled=yes; selector: name=parse2007.codfw.wmnet
16:34 claime: repooling parse2007 - T355549
16:33 cgoubert@cumin2002: conftool action : set/pooled=yes; selector: name=parse2006.codfw.wmnet
16:33 claime: repooling parse2006 - T355549
16:32 claime: uncordoning kubernetes2023 - T355549
16:32 claime: uncordoning kubernetes2032 - T355549
16:29 claime: uncordoning kubernetes2031 - T355549
16:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T354336)', diff saved to https://phabricator.wikimedia.org/P55691 and previous config saved to /var/cache/conftool/dbconfig/20240125-161320-marostegui.json
16:03 topranks: Network maintenance codfw rack b5 underway T355549
15:58 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:30:00 on 32 hosts with reason: Migrating servers in codfw rack B5 to lsw1-b5-codfw T355549
15:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P55690 and previous config saved to /var/cache/conftool/dbconfig/20240125-155813-marostegui.json
15:58 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 1:30:00 on 32 hosts with reason: Migrating servers in codfw rack B5 to lsw1-b5-codfw T355549
15:57 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:30:00 on cr[1-2]-codfw with reason: prepping for server uplink migration
15:57 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 1:30:00 on cr[1-2]-codfw with reason: prepping for server uplink migration
15:54 arnaudb@cumin1002: dbctl commit (dc=all): 'preparing to clone db2169 on db2196 as per TT343674', diff saved to https://phabricator.wikimedia.org/P55689 and previous config saved to /var/cache/conftool/dbconfig/20240125-155450-arnaudb.json
15:52 topranks: disabling puppet fleet-wide to allow for maintenance in codfw rack b5 which hosts puppetmaster2003 T355549
15:46 topranks: configuring lsw1-b5-codfw switch ports for servers to be moved T355549
15:46 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on asw-b-codfw,lsw1-b5-codfw.mgmt with reason: prepping for server uplink migration
15:46 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on asw-b-codfw,lsw1-b5-codfw.mgmt with reason: prepping for server uplink migration
15:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P55688 and previous config saved to /var/cache/conftool/dbconfig/20240125-154307-marostegui.json
15:33 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: wcqs::public
15:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T354336)', diff saved to https://phabricator.wikimedia.org/P55687 and previous config saved to /var/cache/conftool/dbconfig/20240125-152801-marostegui.json
15:25 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: wcqs::public
15:20 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: wdqs::internal
15:20 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps2006.cofw.wmnet
15:19 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: apply
15:18 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: apply
15:10 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: wdqs::internal
14:35 cgoubert@cumin2002: conftool action : set/pooled=inactive; selector: name=parse2007.codfw.wmnet
14:35 claime: Depooling parse2007 (setting inactive) - T355549
14:34 cgoubert@cumin2002: conftool action : set/pooled=inactive; selector: name=parse2006.codfw.wmnet
14:34 claime: Depooling parse2006 (setting inactive) - T355549
14:27 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2179 (T354336)', diff saved to https://phabricator.wikimedia.org/P55684 and previous config saved to /var/cache/conftool/dbconfig/20240125-142729-marostegui.json
14:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2179.codfw.wmnet with reason: Maintenance
14:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2179.codfw.wmnet with reason: Maintenance
14:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T354336)', diff saved to https://phabricator.wikimedia.org/P55683 and previous config saved to /var/cache/conftool/dbconfig/20240125-142706-marostegui.json
14:26 moritzm: installing debmonitor-client 0.3.4 fleet-wide
14:25 claime: Draining kubernetes2023 - T355549
14:25 claime: Draining kubernetes2033 - T355549
14:23 claime: Draining kubernetes2032 - T355549
14:21 claime: Draining kubernetes2031 - T355549
14:21 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 100%: After T355885', diff saved to https://phabricator.wikimedia.org/P55682 and previous config saved to /var/cache/conftool/dbconfig/20240125-142102-root.json
14:18 btullis@cumin1002: END (PASS) - Cookbook sre.hadoop.roll-restart-workers (exit_code=0) restart workers for Hadoop test cluster: Roll restart of jvm daemons for openjdk upgrade.
14:15 moritzm: failover ganeti master for codfw to ganeti2020 T355549
14:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P55681 and previous config saved to /var/cache/conftool/dbconfig/20240125-141200-marostegui.json
14:05 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 75%: After T355885', diff saved to https://phabricator.wikimedia.org/P55680 and previous config saved to /var/cache/conftool/dbconfig/20240125-140557-root.json
14:05 btullis@cumin1002: START - Cookbook sre.hadoop.roll-restart-workers restart workers for Hadoop test cluster: Roll restart of jvm daemons for openjdk upgrade.
13:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P55679 and previous config saved to /var/cache/conftool/dbconfig/20240125-135653-marostegui.json
13:53 btullis@cumin1002: END (PASS) - Cookbook sre.druid.roll-restart-workers (exit_code=0) for Druid test cluster: Roll restart of Druid jvm daemons.
13:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 50%: After T355885', diff saved to https://phabricator.wikimedia.org/P55678 and previous config saved to /var/cache/conftool/dbconfig/20240125-135052-root.json
13:47 volans: uploaded debmonitor-client_0.3.4 to apt.wikimedia.org buster-wikimedia,bullseye-wikimedia,bookworm-wikimedia
13:43 btullis@cumin1002: START - Cookbook sre.druid.roll-restart-workers for Druid test cluster: Roll restart of Druid jvm daemons.
13:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T354336)', diff saved to https://phabricator.wikimedia.org/P55677 and previous config saved to /var/cache/conftool/dbconfig/20240125-134147-marostegui.json
13:39 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2172 (T354336)', diff saved to https://phabricator.wikimedia.org/P55676 and previous config saved to /var/cache/conftool/dbconfig/20240125-133935-marostegui.json
13:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2172.codfw.wmnet with reason: Maintenance
13:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2172.codfw.wmnet with reason: Maintenance
13:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T354336)', diff saved to https://phabricator.wikimedia.org/P55675 and previous config saved to /var/cache/conftool/dbconfig/20240125-133913-marostegui.json
13:35 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 25%: After T355885', diff saved to https://phabricator.wikimedia.org/P55674 and previous config saved to /var/cache/conftool/dbconfig/20240125-133547-root.json
13:32 cmooney@cumin1002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2022.codfw.wmnet
13:28 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2357.codfw.wmnet with OS bullseye
13:28 topranks: draining VMs from ganeti2022 ahead of codfw rack b5 maintenance T355549
13:27 cmooney@cumin1002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2022.codfw.wmnet
13:27 cmooney@cumin1002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2021.codfw.wmnet
13:26 cmooney@cumin1002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2021.codfw.wmnet
13:26 cmooney@cumin1002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2021.codfw.wmnet
13:26 cmooney@cumin1002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2021.codfw.wmnet
13:25 topranks: stopping logstash service on logstash2025 to facilitate VM migration T355549
13:25 cmooney@cumin1002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2021.codfw.wmnet
13:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P55673 and previous config saved to /var/cache/conftool/dbconfig/20240125-132407-marostegui.json
13:24 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2267.codfw.wmnet with OS bullseye
13:21 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2395.codfw.wmnet with OS bullseye
13:20 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 10%: After T355885', diff saved to https://phabricator.wikimedia.org/P55672 and previous config saved to /var/cache/conftool/dbconfig/20240125-132043-root.json
13:18 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2129.codfw.wmnet with reason: Maintenance
13:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2129.codfw.wmnet with reason: Maintenance
13:15 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2129', diff saved to https://phabricator.wikimedia.org/P55671 and previous config saved to /var/cache/conftool/dbconfig/20240125-131547-marostegui.json
13:12 hashar@deploy2002: rebuilt and synchronized wikiversions files: group2 wikis to 1.42.0-wmf.15 refs T354433
13:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P55670 and previous config saved to /var/cache/conftool/dbconfig/20240125-130900-marostegui.json
13:08 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2357.codfw.wmnet with reason: host reimage
13:05 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2267.codfw.wmnet with reason: host reimage
13:02 cmooney@cumin1002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2021.codfw.wmnet
13:02 topranks: draining VMs from ganeti2021 ahead of codfw rack b5 maintenance T355549
13:02 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2395.codfw.wmnet with reason: host reimage
12:58 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2267.codfw.wmnet with reason: host reimage
12:58 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2357.codfw.wmnet with reason: host reimage
12:57 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2395.codfw.wmnet with reason: host reimage
12:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T354336)', diff saved to https://phabricator.wikimedia.org/P55669 and previous config saved to /var/cache/conftool/dbconfig/20240125-125353-marostegui.json
12:41 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2267.codfw.wmnet with OS bullseye
12:41 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2395.codfw.wmnet with OS bullseye
12:41 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2357.codfw.wmnet with OS bullseye
12:12 jgiannelos@deploy2002: Finished deploy [restbase/deploy@708f0f3]: (no justification provided) (duration: 20m 28s)
12:06 moritzm: installing openssh security updates
11:53 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2155 (T354336)', diff saved to https://phabricator.wikimedia.org/P55667 and previous config saved to /var/cache/conftool/dbconfig/20240125-115322-marostegui.json
11:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db2187.codfw.wmnet with reason: Maintenance
11:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on db2187.codfw.wmnet with reason: Maintenance
11:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2155.codfw.wmnet with reason: Maintenance
11:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2155.codfw.wmnet with reason: Maintenance
11:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T354336)', diff saved to https://phabricator.wikimedia.org/P55666 and previous config saved to /var/cache/conftool/dbconfig/20240125-115233-marostegui.json
11:52 jgiannelos@deploy2002: Started deploy [restbase/deploy@708f0f3]: (no justification provided)
11:45 zabe@deploy2002: Finished scap: Backport for Start reading from af_actor/afh_actor in group0 wikis (T355616) (duration: 08m 25s)
11:44 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1038.eqiad.wmnet to cluster eqiad and group D
11:42 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1038.eqiad.wmnet to cluster eqiad and group D
11:38 zabe@deploy2002: zabe: Continuing with sync
11:38 zabe@deploy2002: zabe: Backport for Start reading from af_actor/afh_actor in group0 wikis (T355616) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
11:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P55665 and previous config saved to /var/cache/conftool/dbconfig/20240125-113727-marostegui.json
11:36 zabe@deploy2002: Started scap: Backport for Start reading from af_actor/afh_actor in group0 wikis (T355616)
11:29 hashar@deploy2002: Finished scap: Backport for UserGroupManager: Fix cross-wiki database access (T355813) (duration: 08m 50s)
11:26 claime: Restarting ferm.service on k8s node kubernetes2036.codfw.wmnet - T354855
11:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2107.codfw.wmnet with reason: Maintenance
11:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2107.codfw.wmnet with reason: Maintenance
11:23 hashar@deploy2002: hashar and zabe: Continuing with sync
11:22 hashar@deploy2002: hashar and zabe: Backport for UserGroupManager: Fix cross-wiki database access (T355813) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
11:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P55664 and previous config saved to /var/cache/conftool/dbconfig/20240125-112220-marostegui.json
11:20 hashar@deploy2002: Started scap: Backport for UserGroupManager: Fix cross-wiki database access (T355813)
11:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T354336)', diff saved to https://phabricator.wikimedia.org/P55663 and previous config saved to /var/cache/conftool/dbconfig/20240125-110714-marostegui.json
11:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2147.codfw.wmnet with reason: Maintenance
11:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2147.codfw.wmnet with reason: Maintenance
11:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2139.codfw.wmnet with reason: Maintenance
11:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2139.codfw.wmnet with reason: Maintenance
11:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314 (T354336)', diff saved to https://phabricator.wikimedia.org/P55662 and previous config saved to /var/cache/conftool/dbconfig/20240125-110521-marostegui.json
10:57 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudrabbit1003.eqiad.wmnet with OS bookworm
10:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314', diff saved to https://phabricator.wikimedia.org/P55660 and previous config saved to /var/cache/conftool/dbconfig/20240125-105015-marostegui.json
10:39 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudrabbit1003.eqiad.wmnet with reason: host reimage
10:38 isaranto@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
10:35 taavi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudrabbit1003.eqiad.wmnet with reason: host reimage
10:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314', diff saved to https://phabricator.wikimedia.org/P55659 and previous config saved to /var/cache/conftool/dbconfig/20240125-103509-marostegui.json
10:21 taavi@cumin1002: START - Cookbook sre.hosts.reimage for host cloudrabbit1003.eqiad.wmnet with OS bookworm
10:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314 (T354336)', diff saved to https://phabricator.wikimedia.org/P55658 and previous config saved to /var/cache/conftool/dbconfig/20240125-102002-marostegui.json
10:17 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2138:3314 (T354336)', diff saved to https://phabricator.wikimedia.org/P55657 and previous config saved to /var/cache/conftool/dbconfig/20240125-101750-marostegui.json
10:17 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2138.codfw.wmnet with reason: Maintenance
10:17 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2138.codfw.wmnet with reason: Maintenance
10:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314 (T354336)', diff saved to https://phabricator.wikimedia.org/P55656 and previous config saved to /var/cache/conftool/dbconfig/20240125-101728-marostegui.json
10:17 moritzm: upgrading python-pymysql in S6 DB hosts to 1.0.2-2~wmf11u1 T355531
10:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314', diff saved to https://phabricator.wikimedia.org/P55655 and previous config saved to /var/cache/conftool/dbconfig/20240125-100221-marostegui.json
09:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314', diff saved to https://phabricator.wikimedia.org/P55654 and previous config saved to /var/cache/conftool/dbconfig/20240125-094714-marostegui.json
09:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314 (T354336)', diff saved to https://phabricator.wikimedia.org/P55653 and previous config saved to /var/cache/conftool/dbconfig/20240125-093208-marostegui.json
09:29 stran@deploy2002: Finished scap: Backport for PreAuthenticationProvider: Allow blocking account creation based on IP reputation (T354928) (duration: 17m 24s)
09:18 stran@deploy2002: kharlan and stran: Continuing with sync
09:14 stran@deploy2002: kharlan and stran: Backport for PreAuthenticationProvider: Allow blocking account creation based on IP reputation (T354928) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
09:12 stran@deploy2002: Started scap: Backport for PreAuthenticationProvider: Allow blocking account creation based on IP reputation (T354928)
08:45 stran@deploy2002: stran and kharlan: Backport for PreAuthenticationProvider: Allow blocking account creation based on IP reputation (T354928) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
08:31 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2137.codfw.wmnet with reason: Maintenance
08:31 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2137.codfw.wmnet with reason: Maintenance
08:31 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2136.codfw.wmnet with reason: Maintenance
08:31 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2136.codfw.wmnet with reason: Maintenance
08:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2119 (T354336)', diff saved to https://phabricator.wikimedia.org/P55652 and previous config saved to /var/cache/conftool/dbconfig/20240125-083106-marostegui.json
08:16 stran@deploy2002: Started scap: Backport for PreAuthenticationProvider: Allow blocking account creation based on IP reputation (T354928)
08:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2119', diff saved to https://phabricator.wikimedia.org/P55651 and previous config saved to /var/cache/conftool/dbconfig/20240125-081559-marostegui.json
08:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2119', diff saved to https://phabricator.wikimedia.org/P55650 and previous config saved to /var/cache/conftool/dbconfig/20240125-080053-marostegui.json
07:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2119 (T354336)', diff saved to https://phabricator.wikimedia.org/P55648 and previous config saved to /var/cache/conftool/dbconfig/20240125-074546-marostegui.json
07:43 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2119 (T354336)', diff saved to https://phabricator.wikimedia.org/P55647 and previous config saved to /var/cache/conftool/dbconfig/20240125-074334-marostegui.json
07:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2119.codfw.wmnet with reason: Maintenance
07:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2119.codfw.wmnet with reason: Maintenance
07:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2110 (T354336)', diff saved to https://phabricator.wikimedia.org/P55646 and previous config saved to /var/cache/conftool/dbconfig/20240125-074312-marostegui.json
07:33 marostegui@cumin1002: dbctl commit (dc=all): 'db2136 (re)pooling @ 100%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55645 and previous config saved to /var/cache/conftool/dbconfig/20240125-073319-root.json
07:33 marostegui@cumin1002: dbctl commit (dc=all): 'es2026 (re)pooling @ 100%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55644 and previous config saved to /var/cache/conftool/dbconfig/20240125-073310-root.json
07:32 marostegui@cumin1002: dbctl commit (dc=all): 'db2157 (re)pooling @ 100%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55643 and previous config saved to /var/cache/conftool/dbconfig/20240125-073252-root.json
07:32 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 100%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55642 and previous config saved to /var/cache/conftool/dbconfig/20240125-073244-root.json
07:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2110', diff saved to https://phabricator.wikimedia.org/P55641 and previous config saved to /var/cache/conftool/dbconfig/20240125-072806-marostegui.json
07:20 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2137:3315 T355549', diff saved to https://phabricator.wikimedia.org/P55640 and previous config saved to /var/cache/conftool/dbconfig/20240125-072010-marostegui.json
07:18 marostegui@cumin1002: dbctl commit (dc=all): 'db2136 (re)pooling @ 75%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55639 and previous config saved to /var/cache/conftool/dbconfig/20240125-071813-root.json
07:18 marostegui@cumin1002: dbctl commit (dc=all): 'es2026 (re)pooling @ 75%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55638 and previous config saved to /var/cache/conftool/dbconfig/20240125-071805-root.json
07:17 marostegui@cumin1002: dbctl commit (dc=all): 'db2157 (re)pooling @ 75%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55637 and previous config saved to /var/cache/conftool/dbconfig/20240125-071747-root.json
07:17 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 75%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55636 and previous config saved to /var/cache/conftool/dbconfig/20240125-071739-root.json
07:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2110', diff saved to https://phabricator.wikimedia.org/P55635 and previous config saved to /var/cache/conftool/dbconfig/20240125-071259-marostegui.json
07:12 marostegui@cumin1002: dbctl commit (dc=all): 'db2159 db2160 db2109 db2107 db2137:3314 db2135:3315 db2143 db2147 db2177 db2178 db2188 T355549', diff saved to https://phabricator.wikimedia.org/P55634 and previous config saved to /var/cache/conftool/dbconfig/20240125-071253-marostegui.json
07:06 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2107 T355682', diff saved to https://phabricator.wikimedia.org/P55633 and previous config saved to /var/cache/conftool/dbconfig/20240125-070604-marostegui.json
07:03 marostegui@cumin1002: dbctl commit (dc=all): 'db2136 (re)pooling @ 50%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55632 and previous config saved to /var/cache/conftool/dbconfig/20240125-070308-root.json
07:03 marostegui@cumin1002: dbctl commit (dc=all): 'es2026 (re)pooling @ 50%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55631 and previous config saved to /var/cache/conftool/dbconfig/20240125-070300-root.json
07:02 marostegui@cumin1002: dbctl commit (dc=all): 'db2157 (re)pooling @ 50%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55630 and previous config saved to /var/cache/conftool/dbconfig/20240125-070242-root.json
07:02 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 50%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55629 and previous config saved to /var/cache/conftool/dbconfig/20240125-070234-root.json
07:01 marostegui@cumin1002: dbctl commit (dc=all): 'Promote db2104 to s2 primary and set section read-write T355682', diff saved to https://phabricator.wikimedia.org/P55628 and previous config saved to /var/cache/conftool/dbconfig/20240125-070153-marostegui.json
07:01 marostegui@cumin1002: dbctl commit (dc=all): 'Set s2 codfw as read-only for maintenance - T355682', diff saved to https://phabricator.wikimedia.org/P55627 and previous config saved to /var/cache/conftool/dbconfig/20240125-070120-marostegui.json
07:00 marostegui: Starting s2 codfw failover from db2107 to db2104 - T355682
06:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2110 (T354336)', diff saved to https://phabricator.wikimedia.org/P55626 and previous config saved to /var/cache/conftool/dbconfig/20240125-065535-marostegui.json
06:48 marostegui@cumin1002: dbctl commit (dc=all): 'db2136 (re)pooling @ 25%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55625 and previous config saved to /var/cache/conftool/dbconfig/20240125-064803-root.json
06:47 marostegui@cumin1002: dbctl commit (dc=all): 'es2026 (re)pooling @ 25%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55624 and previous config saved to /var/cache/conftool/dbconfig/20240125-064755-root.json
06:47 marostegui@cumin1002: dbctl commit (dc=all): 'db2157 (re)pooling @ 25%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55623 and previous config saved to /var/cache/conftool/dbconfig/20240125-064737-root.json
06:47 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 25%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55622 and previous config saved to /var/cache/conftool/dbconfig/20240125-064729-root.json
06:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2110 (T354336)', diff saved to https://phabricator.wikimedia.org/P55621 and previous config saved to /var/cache/conftool/dbconfig/20240125-064420-marostegui.json
06:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2110.codfw.wmnet with reason: Maintenance
06:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2110.codfw.wmnet with reason: Maintenance
06:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2106 (T354336)', diff saved to https://phabricator.wikimedia.org/P55620 and previous config saved to /var/cache/conftool/dbconfig/20240125-064357-marostegui.json
06:37 marostegui@deploy2002: Finished scap: Backport for ProductionServices.php: Promote pc2014 (T355683) (duration: 08m 42s)
06:32 marostegui@cumin1002: dbctl commit (dc=all): 'db2136 (re)pooling @ 10%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55619 and previous config saved to /var/cache/conftool/dbconfig/20240125-063258-root.json
06:32 marostegui@cumin1002: dbctl commit (dc=all): 'es2026 (re)pooling @ 10%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55618 and previous config saved to /var/cache/conftool/dbconfig/20240125-063250-root.json
06:32 marostegui@cumin1002: dbctl commit (dc=all): 'db2157 (re)pooling @ 10%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55617 and previous config saved to /var/cache/conftool/dbconfig/20240125-063232-root.json
06:32 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 10%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55616 and previous config saved to /var/cache/conftool/dbconfig/20240125-063225-root.json
06:31 marostegui@deploy2002: marostegui: Continuing with sync
06:31 marostegui@deploy2002: marostegui: Backport for ProductionServices.php: Promote pc2014 (T355683) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
06:29 marostegui@deploy2002: Started scap: Backport for ProductionServices.php: Promote pc2014 (T355683)
06:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2106', diff saved to https://phabricator.wikimedia.org/P55615 and previous config saved to /var/cache/conftool/dbconfig/20240125-062851-marostegui.json
06:17 marostegui@cumin1002: dbctl commit (dc=all): 'db2136 (re)pooling @ 5%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55614 and previous config saved to /var/cache/conftool/dbconfig/20240125-061753-root.json
06:17 marostegui@cumin1002: dbctl commit (dc=all): 'es2026 (re)pooling @ 5%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55613 and previous config saved to /var/cache/conftool/dbconfig/20240125-061745-root.json
06:17 marostegui@cumin1002: dbctl commit (dc=all): 'db2157 (re)pooling @ 5%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55612 and previous config saved to /var/cache/conftool/dbconfig/20240125-061727-root.json
06:17 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 5%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55611 and previous config saved to /var/cache/conftool/dbconfig/20240125-061719-root.json
06:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2106', diff saved to https://phabricator.wikimedia.org/P55610 and previous config saved to /var/cache/conftool/dbconfig/20240125-061344-marostegui.json
06:11 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 28 hosts with reason: Primary switchover s2 T355682
06:10 marostegui@cumin1002: dbctl commit (dc=all): 'Set db2104 with weight 0 T355682', diff saved to https://phabricator.wikimedia.org/P55609 and previous config saved to /var/cache/conftool/dbconfig/20240125-061048-root.json
06:10 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 28 hosts with reason: Primary switchover s2 T355682
06:02 marostegui@cumin1002: dbctl commit (dc=all): 'db2136 (re)pooling @ 1%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55608 and previous config saved to /var/cache/conftool/dbconfig/20240125-060249-root.json
06:02 marostegui@cumin1002: dbctl commit (dc=all): 'es2026 (re)pooling @ 1%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55607 and previous config saved to /var/cache/conftool/dbconfig/20240125-060240-root.json
06:02 marostegui@cumin1002: dbctl commit (dc=all): 'db2157 (re)pooling @ 1%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55606 and previous config saved to /var/cache/conftool/dbconfig/20240125-060222-root.json
06:02 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 1%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P55605 and previous config saved to /var/cache/conftool/dbconfig/20240125-060214-root.json
05:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2106 (T354336)', diff saved to https://phabricator.wikimedia.org/P55604 and previous config saved to /var/cache/conftool/dbconfig/20240125-055837-marostegui.json
05:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2106 (T354336)', diff saved to https://phabricator.wikimedia.org/P55603 and previous config saved to /var/cache/conftool/dbconfig/20240125-055626-marostegui.json
05:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2106.codfw.wmnet with reason: Maintenance
05:56 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2106.codfw.wmnet with reason: Maintenance
05:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2099.codfw.wmnet with reason: Maintenance
05:55 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2099.codfw.wmnet with reason: Maintenance
05:51 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1160.eqiad.wmnet with reason: Maintenance
05:50 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1160.eqiad.wmnet with reason: Maintenance
02:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
02:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
02:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1249 (T354336)', diff saved to https://phabricator.wikimedia.org/P55602 and previous config saved to /var/cache/conftool/dbconfig/20240125-022727-marostegui.json
02:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P55601 and previous config saved to /var/cache/conftool/dbconfig/20240125-021221-marostegui.json
01:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P55600 and previous config saved to /var/cache/conftool/dbconfig/20240125-015714-marostegui.json
01:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1249 (T354336)', diff saved to https://phabricator.wikimedia.org/P55599 and previous config saved to /var/cache/conftool/dbconfig/20240125-014208-marostegui.json
01:40 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1249 (T354336)', diff saved to https://phabricator.wikimedia.org/P55598 and previous config saved to /var/cache/conftool/dbconfig/20240125-013958-marostegui.json
01:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1249.eqiad.wmnet with reason: Maintenance
01:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1249.eqiad.wmnet with reason: Maintenance
01:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1248 (T354336)', diff saved to https://phabricator.wikimedia.org/P55597 and previous config saved to /var/cache/conftool/dbconfig/20240125-013936-marostegui.json
01:28 fab@deploy2002: Finished deploy [airflow-dags/research@e6aa85a]: (no justification provided) (duration: 00m 13s)
01:28 fab@deploy2002: Started deploy [airflow-dags/research@e6aa85a]: (no justification provided)
01:25 eileen: civicrm upgraded from b85b6dde to 69d4ebe3
01:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P55596 and previous config saved to /var/cache/conftool/dbconfig/20240125-012430-marostegui.json
01:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P55595 and previous config saved to /var/cache/conftool/dbconfig/20240125-010923-marostegui.json
00:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1248 (T354336)', diff saved to https://phabricator.wikimedia.org/P55594 and previous config saved to /var/cache/conftool/dbconfig/20240125-005417-marostegui.json
00:53 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1248 (T354336)', diff saved to https://phabricator.wikimedia.org/P55593 and previous config saved to /var/cache/conftool/dbconfig/20240125-005307-marostegui.json
00:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1248.eqiad.wmnet with reason: Maintenance
00:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1248.eqiad.wmnet with reason: Maintenance
00:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1247 (T354336)', diff saved to https://phabricator.wikimedia.org/P55592 and previous config saved to /var/cache/conftool/dbconfig/20240125-005245-marostegui.json
00:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P55591 and previous config saved to /var/cache/conftool/dbconfig/20240125-003739-marostegui.json
00:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P55590 and previous config saved to /var/cache/conftool/dbconfig/20240125-002233-marostegui.json
00:12 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2103.codfw.wmnet with OS bullseye
00:12 zabe@deploy2002: Finished scap: Backport for Start reading from af_user(_text)/afh_user(_text) in testwiki (T355616) (duration: 09m 36s)
00:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1247 (T354336)', diff saved to https://phabricator.wikimedia.org/P55589 and previous config saved to /var/cache/conftool/dbconfig/20240125-000726-marostegui.json
00:05 zabe@deploy2002: zabe: Continuing with sync
00:05 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1247 (T354336)', diff saved to https://phabricator.wikimedia.org/P55588 and previous config saved to /var/cache/conftool/dbconfig/20240125-000515-marostegui.json
00:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1247.eqiad.wmnet with reason: Maintenance
00:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1247.eqiad.wmnet with reason: Maintenance
00:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1243 (T354336)', diff saved to https://phabricator.wikimedia.org/P55587 and previous config saved to /var/cache/conftool/dbconfig/20240125-000452-marostegui.json
00:04 zabe@deploy2002: zabe: Backport for Start reading from af_user(_text)/afh_user(_text) in testwiki (T355616) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
00:02 zabe@deploy2002: Started scap: Backport for Start reading from af_user(_text)/afh_user(_text) in testwiki (T355616)

2024-01-24

23:54 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2103.codfw.wmnet with reason: host reimage
23:51 ryankemper@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2103.codfw.wmnet with reason: host reimage
23:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P55586 and previous config saved to /var/cache/conftool/dbconfig/20240124-234946-marostegui.json
23:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P55585 and previous config saved to /var/cache/conftool/dbconfig/20240124-233439-marostegui.json
23:34 ryankemper@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2103.codfw.wmnet with OS bullseye
23:33 jforrester@deploy2002: Finished scap: Backport for Revert "Update spacing to improve consistency of ul/ol spacing, also update heading spacing to be more consistent, relying on mw defaults more" (T355805 T354433) (duration: 13m 29s)
23:32 ryankemper@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2105.codfw.wmnet with OS bullseye
23:32 ryankemper@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2104.codfw.wmnet with OS bullseye
23:26 jforrester@deploy2002: jforrester: Continuing with sync
23:21 jforrester@deploy2002: jforrester: Backport for Revert "Update spacing to improve consistency of ul/ol spacing, also update heading spacing to be more consistent, relying on mw defaults more" (T355805 T354433) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
23:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1243 (T354336)', diff saved to https://phabricator.wikimedia.org/P55584 and previous config saved to /var/cache/conftool/dbconfig/20240124-231933-marostegui.json
23:19 jforrester@deploy2002: Started scap: Backport for Revert "Update spacing to improve consistency of ul/ol spacing, also update heading spacing to be more consistent, relying on mw defaults more" (T355805 T354433)
23:17 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1243 (T354336)', diff saved to https://phabricator.wikimedia.org/P55583 and previous config saved to /var/cache/conftool/dbconfig/20240124-231723-marostegui.json
23:17 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1243.eqiad.wmnet with reason: Maintenance
23:17 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1243.eqiad.wmnet with reason: Maintenance
23:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1242 (T354336)', diff saved to https://phabricator.wikimedia.org/P55582 and previous config saved to /var/cache/conftool/dbconfig/20240124-231701-marostegui.json
23:04 ryankemper@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2103.codfw.wmnet with OS bullseye
23:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P55581 and previous config saved to /var/cache/conftool/dbconfig/20240124-230155-marostegui.json
22:50 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2106.codfw.wmnet with OS bullseye
22:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P55580 and previous config saved to /var/cache/conftool/dbconfig/20240124-224648-marostegui.json
22:39 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 10 hosts with reason: cloduelastic maintenance
22:39 ryankemper@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on 10 hosts with reason: cloduelastic maintenance
22:33 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2106.codfw.wmnet with reason: host reimage
22:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1242 (T354336)', diff saved to https://phabricator.wikimedia.org/P55579 and previous config saved to /var/cache/conftool/dbconfig/20240124-223142-marostegui.json
22:29 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1242 (T354336)', diff saved to https://phabricator.wikimedia.org/P55578 and previous config saved to /var/cache/conftool/dbconfig/20240124-222932-marostegui.json
22:29 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1242.eqiad.wmnet with reason: Maintenance
22:29 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1242.eqiad.wmnet with reason: Maintenance
22:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1241 (T354336)', diff saved to https://phabricator.wikimedia.org/P55577 and previous config saved to /var/cache/conftool/dbconfig/20240124-222910-marostegui.json
22:28 ryankemper@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2106.codfw.wmnet with reason: host reimage
22:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P55576 and previous config saved to /var/cache/conftool/dbconfig/20240124-221403-marostegui.json
22:11 ryankemper@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2105.codfw.wmnet with OS bullseye
22:11 ryankemper@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2106.codfw.wmnet with OS bullseye
22:11 ryankemper@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2104.codfw.wmnet with OS bullseye
22:10 ryankemper@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2103.codfw.wmnet with OS bullseye
21:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P55575 and previous config saved to /var/cache/conftool/dbconfig/20240124-215857-marostegui.json
21:45 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching restbase[2022-2035].codfw.wmnet: Updated Cassandra to 4.1.1-wmf1 — T355719 - eevans@cumin1002
21:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1241 (T354336)', diff saved to https://phabricator.wikimedia.org/P55574 and previous config saved to /var/cache/conftool/dbconfig/20240124-214351-marostegui.json
21:41 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1241 (T354336)', diff saved to https://phabricator.wikimedia.org/P55573 and previous config saved to /var/cache/conftool/dbconfig/20240124-214141-marostegui.json
21:41 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1241.eqiad.wmnet with reason: Maintenance
21:41 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1241.eqiad.wmnet with reason: Maintenance
21:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1238 (T354336)', diff saved to https://phabricator.wikimedia.org/P55572 and previous config saved to /var/cache/conftool/dbconfig/20240124-214120-marostegui.json
21:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P55571 and previous config saved to /var/cache/conftool/dbconfig/20240124-212613-marostegui.json
21:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P55570 and previous config saved to /var/cache/conftool/dbconfig/20240124-211107-marostegui.json
21:05 aqu@deploy2002: Finished deploy [airflow-dags/analytics@5a0681b]: Regular analytics weekly train [airflow-dags/analytics@5a0681bc] (duration: 00m 37s)
21:05 aqu@deploy2002: Started deploy [airflow-dags/analytics@5a0681b]: Regular analytics weekly train [airflow-dags/analytics@5a0681bc]
20:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1238 (T354336)', diff saved to https://phabricator.wikimedia.org/P55569 and previous config saved to /var/cache/conftool/dbconfig/20240124-205600-marostegui.json
20:53 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1238 (T354336)', diff saved to https://phabricator.wikimedia.org/P55568 and previous config saved to /var/cache/conftool/dbconfig/20240124-205350-marostegui.json
20:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1238.eqiad.wmnet with reason: Maintenance
20:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1238.eqiad.wmnet with reason: Maintenance
20:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1221 (T354336)', diff saved to https://phabricator.wikimedia.org/P55567 and previous config saved to /var/cache/conftool/dbconfig/20240124-205327-marostegui.json
20:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P55566 and previous config saved to /var/cache/conftool/dbconfig/20240124-203821-marostegui.json
20:38 fab@deploy2002: Finished deploy [airflow-dags/research@2f514fc]: (no justification provided) (duration: 00m 33s)
20:37 fab@deploy2002: Started deploy [airflow-dags/research@2f514fc]: (no justification provided)
20:26 zabe: zabe@mwmaint2002:~$ mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=scowiki --logwiki=metawiki 'TheBabushka' 'AshotGPT' # T355743
20:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P55565 and previous config saved to /var/cache/conftool/dbconfig/20240124-202315-marostegui.json
20:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1221 (T354336)', diff saved to https://phabricator.wikimedia.org/P55564 and previous config saved to /var/cache/conftool/dbconfig/20240124-200808-marostegui.json
20:07 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1221 (T354336)', diff saved to https://phabricator.wikimedia.org/P55563 and previous config saved to /var/cache/conftool/dbconfig/20240124-200659-marostegui.json
20:06 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
20:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
20:06 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1221.eqiad.wmnet with reason: Maintenance
20:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1221.eqiad.wmnet with reason: Maintenance
20:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T354336)', diff saved to https://phabricator.wikimedia.org/P55562 and previous config saved to /var/cache/conftool/dbconfig/20240124-200619-marostegui.json
20:02 cstone: payments-wiki upgraded from a3691a8e to 8cfbbb4b
19:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P55561 and previous config saved to /var/cache/conftool/dbconfig/20240124-195113-marostegui.json
19:39 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
19:38 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
19:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P55560 and previous config saved to /var/cache/conftool/dbconfig/20240124-193606-marostegui.json
19:35 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
19:34 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
19:34 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
19:33 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
19:24 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
19:23 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
19:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T354336)', diff saved to https://phabricator.wikimedia.org/P55559 and previous config saved to /var/cache/conftool/dbconfig/20240124-192100-marostegui.json
19:19 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1199 (T354336)', diff saved to https://phabricator.wikimedia.org/P55558 and previous config saved to /var/cache/conftool/dbconfig/20240124-191850-marostegui.json
19:18 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1199.eqiad.wmnet with reason: Maintenance
19:18 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1199.eqiad.wmnet with reason: Maintenance
19:18 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T354336)', diff saved to https://phabricator.wikimedia.org/P55557 and previous config saved to /var/cache/conftool/dbconfig/20240124-191828-marostegui.json
19:16 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching restbase[2022-2035].codfw.wmnet: Updated Cassandra to 4.1.1-wmf1 — T355719 - eevans@cumin1002
19:13 eevans@cumin1002: END (FAIL) - Cookbook sre.cassandra.roll-restart (exit_code=99) for nodes matching restbase[2017-2035].codfw.wmnet: Updated Cassandra to 4.1.1-wmf1 — T355719 - eevans@cumin1002
19:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P55555 and previous config saved to /var/cache/conftool/dbconfig/20240124-190322-marostegui.json
18:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P55554 and previous config saved to /var/cache/conftool/dbconfig/20240124-184815-marostegui.json
18:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T354336)', diff saved to https://phabricator.wikimedia.org/P55553 and previous config saved to /var/cache/conftool/dbconfig/20240124-183308-marostegui.json
18:31 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1190 (T354336)', diff saved to https://phabricator.wikimedia.org/P55552 and previous config saved to /var/cache/conftool/dbconfig/20240124-183059-marostegui.json
18:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1190.eqiad.wmnet with reason: Maintenance
18:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1190.eqiad.wmnet with reason: Maintenance
18:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1150.eqiad.wmnet with reason: Maintenance
18:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1150.eqiad.wmnet with reason: Maintenance
18:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1149.eqiad.wmnet with reason: Maintenance
18:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1149.eqiad.wmnet with reason: Maintenance
18:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T354336)', diff saved to https://phabricator.wikimedia.org/P55551 and previous config saved to /var/cache/conftool/dbconfig/20240124-183001-marostegui.json
18:24 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching restbase[2017-2035].codfw.wmnet: Updated Cassandra to 4.1.1-wmf1 — T355719 - eevans@cumin1002
18:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P55550 and previous config saved to /var/cache/conftool/dbconfig/20240124-181455-marostegui.json
18:09 mfossati@deploy2002: Finished deploy [airflow-dags/platform_eng@fed6de3]: (no justification provided) (duration: 00m 32s)
18:08 mfossati@deploy2002: Started deploy [airflow-dags/platform_eng@fed6de3]: (no justification provided)
17:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P55549 and previous config saved to /var/cache/conftool/dbconfig/20240124-175948-marostegui.json
17:50 klausman@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
17:50 klausman@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
17:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
17:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
17:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T354336)', diff saved to https://phabricator.wikimedia.org/P55548 and previous config saved to /var/cache/conftool/dbconfig/20240124-174442-marostegui.json
17:43 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1146:3314 (T354336)', diff saved to https://phabricator.wikimedia.org/P55547 and previous config saved to /var/cache/conftool/dbconfig/20240124-174332-marostegui.json
17:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1146.eqiad.wmnet with reason: Maintenance
17:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1146.eqiad.wmnet with reason: Maintenance
17:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1145.eqiad.wmnet with reason: Maintenance
17:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1145.eqiad.wmnet with reason: Maintenance
17:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T354336)', diff saved to https://phabricator.wikimedia.org/P55546 and previous config saved to /var/cache/conftool/dbconfig/20240124-174251-marostegui.json
17:35 eevans@cumin1002: END (FAIL) - Cookbook sre.cassandra.roll-restart (exit_code=99) for nodes matching restbase[2015-2035].codfw.wmnet: Updated Cassandra to 4.1.1-wmf1 — T355719 - eevans@cumin1002
17:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P55545 and previous config saved to /var/cache/conftool/dbconfig/20240124-172745-marostegui.json
17:24 hashar@deploy2002: Synchronized php: group1 wikis to 1.42.0-wmf.15 refs T354433 (duration: 07m 10s)
17:17 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching restbase[2015-2035].codfw.wmnet: Updated Cassandra to 4.1.1-wmf1 — T355719 - eevans@cumin1002
17:16 hashar@deploy2002: rebuilt and synchronized wikiversions files: group1 wikis to 1.42.0-wmf.15 refs T354433
17:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P55544 and previous config saved to /var/cache/conftool/dbconfig/20240124-171238-marostegui.json
17:10 sukhe: sudo cumin -b1 -s60 "R:Class = Bird" "enable-puppet 'CR991699' && run-puppet-agent"
17:09 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching restbase103[1-3].eqiad.wmnet: Updated Cassandra to 4.1.1-wmf1 — T355719 - eevans@cumin1002
17:06 jnuche@deploy2002: Finished deploy [releng/jenkins-deploy@16476a9] (releasing): (no justification provided) (duration: 01m 07s)
17:06 jnuche@deploy2002: Started deploy [releng/jenkins-deploy@16476a9] (releasing): (no justification provided)
17:05 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2053.codfw.wmnet
17:05 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1053.eqiad.wmnet
16:59 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2053.codfw.wmnet
16:59 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1053.eqiad.wmnet
16:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T354336)', diff saved to https://phabricator.wikimedia.org/P55543 and previous config saved to /var/cache/conftool/dbconfig/20240124-165732-marostegui.json
16:56 vgutierrez: enable puppet on cp3066 - T354424
16:55 sukhe: enable puppet on durum1001 to test CR 991699
16:55 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1144:3314 (T354336)', diff saved to https://phabricator.wikimedia.org/P55542 and previous config saved to /var/cache/conftool/dbconfig/20240124-165522-marostegui.json
16:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1144.eqiad.wmnet with reason: Maintenance
16:55 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1144.eqiad.wmnet with reason: Maintenance
16:54 XioNoX: disable puppet on all the hosts running bird to deploy https://gerrit.wikimedia.org/r/c/operations/puppet/+/991699
16:39 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching restbase103[1-3].eqiad.wmnet: Updated Cassandra to 4.1.1-wmf1 — T355719 - eevans@cumin1002
16:35 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2103.codfw.wmnet with reason: Maintenance
16:35 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2103.codfw.wmnet with reason: Maintenance
16:30 eevans@cumin1002: END (FAIL) - Cookbook sre.cassandra.roll-restart (exit_code=99) for nodes matching A:restbase-eqiad: Updated Cassandra to 4.1.1-wmf1 — T355719 - eevans@cumin1002
16:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2188 (T354336)', diff saved to https://phabricator.wikimedia.org/P55541 and previous config saved to /var/cache/conftool/dbconfig/20240124-162532-marostegui.json
16:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P55540 and previous config saved to /var/cache/conftool/dbconfig/20240124-161026-marostegui.json
16:04 klausman@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
16:04 klausman@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
16:03 klausman@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
16:03 klausman@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
15:58 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host phab2002.codfw.wmnet
15:57 hashar@deploy2002: Synchronized php-1.42.0-wmf.15/extensions/Echo/includes/Formatters/EchoRevertedPresentationModel.php: Fix EchoRevertedPresentationModel using null as string - T355751 (duration: 09m 06s)
15:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P55539 and previous config saved to /var/cache/conftool/dbconfig/20240124-155519-marostegui.json
15:50 vgutierrez: disable puppet on cp3066 - T354424
15:48 sukhe: sudo cumin -b1 -s120 'A:dns-rec' "enable-puppet 'merging CR 980929' && run-puppet-agent"
15:47 hashar@deploy2002: Synchronized php-1.42.0-wmf.15/extensions/CentralAuth/tests/phpunit/CentralAuthIdLookupTest.php: Fix CentralIdLookup tests (duration: 11m 18s)
15:45 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2446.codfw.wmnet with OS bullseye
15:42 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2430.codfw.wmnet with OS bullseye
15:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2188 (T354336)', diff saved to https://phabricator.wikimedia.org/P55538 and previous config saved to /var/cache/conftool/dbconfig/20240124-154013-marostegui.json
15:39 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2427.codfw.wmnet with OS bullseye
15:38 sukhe: sudo cumin 'A:dns-rec' "disable-puppet 'merging CR 980929'"
15:38 klausman@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
15:38 klausman@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
15:38 klausman@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
15:37 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2188 (T354336)', diff saved to https://phabricator.wikimedia.org/P55537 and previous config saved to /var/cache/conftool/dbconfig/20240124-153752-marostegui.json
15:37 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2188.codfw.wmnet with reason: Maintenance
15:37 klausman@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
15:37 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2188.codfw.wmnet with reason: Maintenance
15:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T354336)', diff saved to https://phabricator.wikimedia.org/P55536 and previous config saved to /var/cache/conftool/dbconfig/20240124-153730-marostegui.json
15:37 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host phab2002.codfw.wmnet
15:37 klausman@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
15:36 klausman@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
15:32 moritzm: imported jenkins 2.426.3 for buster/bullseye T355503
15:25 aqu@deploy2002: Finished deploy [airflow-dags/analytics@da2e61c]: Regular analytics weekly train [airflow-dags/analytics@da2e61c7] (duration: 00m 42s)
15:25 aqu@deploy2002: Started deploy [airflow-dags/analytics@da2e61c]: Regular analytics weekly train [airflow-dags/analytics@da2e61c7]
15:25 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2446.codfw.wmnet with reason: host reimage
15:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P55534 and previous config saved to /var/cache/conftool/dbconfig/20240124-152224-marostegui.json
15:22 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2430.codfw.wmnet with reason: host reimage
15:21 aqu: Refinery weekly deployment train - end (scap, then deployed onto hdfs) (test cluster deploy still broken T354703)
15:19 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2427.codfw.wmnet with reason: host reimage
15:17 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2430.codfw.wmnet with reason: host reimage
15:16 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2446.codfw.wmnet with reason: host reimage
15:16 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2427.codfw.wmnet with reason: host reimage
15:12 aqu@deploy2002: Finished deploy [analytics/refinery@13f7a06] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@13f7a06c] (duration: 03m 28s)
15:11 moritzm: uploading pymysql 1.0.2-2~wmf11u1 to apt.wikimedia.org T355531
15:09 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2055.codfw.wmnet
15:08 aqu@deploy2002: Started deploy [analytics/refinery@13f7a06] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@13f7a06c]
15:08 aqu@deploy2002: Finished deploy [analytics/refinery@13f7a06] (thin): Regular analytics weekly train THIN [analytics/refinery@13f7a06c] (duration: 00m 05s)
15:08 aqu@deploy2002: Started deploy [analytics/refinery@13f7a06] (thin): Regular analytics weekly train THIN [analytics/refinery@13f7a06c]
15:07 aqu@deploy2002: Finished deploy [analytics/refinery@13f7a06]: Regular analytics weekly train [analytics/refinery@13f7a06c] (duration: 10m 12s)
15:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P55533 and previous config saved to /var/cache/conftool/dbconfig/20240124-150718-marostegui.json
15:04 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2055.codfw.wmnet
14:59 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2446.codfw.wmnet with OS bullseye
14:59 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2430.codfw.wmnet with OS bullseye
14:59 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2427.codfw.wmnet with OS bullseye
14:57 aqu@deploy2002: Started deploy [analytics/refinery@13f7a06]: Regular analytics weekly train [analytics/refinery@13f7a06c]
14:57 akosiaris@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
14:57 akosiaris@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
14:56 akosiaris@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
14:56 akosiaris@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
14:56 akosiaris@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
14:56 aqu@deploy2002: Finished deploy [analytics/refinery@d1ee04c] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@d1ee04cc] (duration: 03m 40s)
14:56 akosiaris@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
14:55 akosiaris: bump eventrouter limits/requests memory/cpu
14:55 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
14:55 bking@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['elastic2094.codfw.wmnet']
14:55 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
14:52 aqu@deploy2002: Started deploy [analytics/refinery@d1ee04c] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@d1ee04cc]
14:52 aqu@deploy2002: Finished deploy [analytics/refinery@d1ee04c] (thin): Regular analytics weekly train THIN [analytics/refinery@d1ee04cc] (duration: 00m 06s)
14:52 aqu@deploy2002: Started deploy [analytics/refinery@d1ee04c] (thin): Regular analytics weekly train THIN [analytics/refinery@d1ee04cc]
14:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T354336)', diff saved to https://phabricator.wikimedia.org/P55532 and previous config saved to /var/cache/conftool/dbconfig/20240124-145211-marostegui.json
14:51 Lucas_WMDE: UTC afternoon backport+config window done
14:50 aqu@deploy2002: Finished deploy [analytics/refinery@d1ee04c]: Regular analytics weekly train [analytics/refinery@d1ee04cc] (duration: 09m 11s)
14:50 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for cswiki: remove unused birthday logo files (duration: 09m 36s)
14:50 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2176 (T354336)', diff saved to https://phabricator.wikimedia.org/P55531 and previous config saved to /var/cache/conftool/dbconfig/20240124-144947-marostegui.json
14:49 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2176.codfw.wmnet with reason: Maintenance
14:49 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2176.codfw.wmnet with reason: Maintenance
14:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T354336)', diff saved to https://phabricator.wikimedia.org/P55530 and previous config saved to /var/cache/conftool/dbconfig/20240124-144925-marostegui.json
14:47 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2054.codfw.wmnet
14:44 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde: Continuing with sync
14:43 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde: Backport for cswiki: remove unused birthday logo files synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:41 aqu@deploy2002: Started deploy [analytics/refinery@d1ee04c]: Regular analytics weekly train [analytics/refinery@d1ee04cc]
14:41 aqu@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
14:41 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for cswiki: remove unused birthday logo files
14:40 aqu@deploy2002: helmfile [codfw] START helmfile.d/services/eventstreams: apply
14:39 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for [azwiki] Add new namespace aliases (T355041) (duration: 10m 00s)
14:38 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2054.codfw.wmnet
14:37 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1054.eqiad.wmnet
14:37 aqu@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
14:36 aqu@deploy2002: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
14:36 aqu@deploy2002: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
14:35 aqu@deploy2002: helmfile [staging] START helmfile.d/services/eventstreams: apply
14:35 aqu@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
14:35 aqu@deploy2002: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
14:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1038.eqiad.wmnet
14:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P55529 and previous config saved to /var/cache/conftool/dbconfig/20240124-143419-marostegui.json
14:34 aqu@deploy2002: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
14:33 aqu@deploy2002: helmfile [staging] START helmfile.d/services/eventstreams: apply
14:33 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1054.eqiad.wmnet
14:32 logmsgbot: lucaswerkmeister-wmde@deploy2002 superpes and lucaswerkmeister-wmde: Continuing with sync
14:31 aqu: analytics/refinery weekly deployment train - begin
14:31 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2052.codfw.wmnet
14:31 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1052.eqiad.wmnet
14:30 logmsgbot: lucaswerkmeister-wmde@deploy2002 superpes and lucaswerkmeister-wmde: Backport for [azwiki] Add new namespace aliases (T355041) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:29 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-eqiad: Updated Cassandra to 4.1.1-wmf1 — T355719 - eevans@cumin1002
14:29 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for [azwiki] Add new namespace aliases (T355041)
14:27 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for [ganwiki] Change autoconfirmed setting (T355126) (duration: 09m 51s)
14:26 bking@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['elastic2094.codfw.wmnet']
14:25 bking@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['elastic2094.codfw.wmnet']
14:25 bking@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['elastic2094.codfw.wmnet']
14:25 bking@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['elastic2094.codfw.wmnet']
14:25 bking@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['elastic2094.codfw.wmnet']
14:25 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2052.codfw.wmnet
14:25 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1052.eqiad.wmnet
14:25 bking@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['elastic2088.codfw.wmnet']
14:24 bking@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['elastic2088.codfw.wmnet']
14:24 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1038.eqiad.wmnet
14:20 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and superpes: Continuing with sync
14:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P55527 and previous config saved to /var/cache/conftool/dbconfig/20240124-141912-marostegui.json
14:19 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and superpes: Backport for [ganwiki] Change autoconfirmed setting (T355126) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:17 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for [ganwiki] Change autoconfirmed setting (T355126)
14:14 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for Add mediawiki.reference_previews to wgEventLoggingStreamNames (T353798) (duration: 10m 52s)
14:09 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2053.codfw.wmnet
14:08 logmsgbot: lucaswerkmeister-wmde@deploy2002 wmde-fisch and lucaswerkmeister-wmde: Continuing with sync
14:05 logmsgbot: lucaswerkmeister-wmde@deploy2002 wmde-fisch and lucaswerkmeister-wmde: Backport for Add mediawiki.reference_previews to wgEventLoggingStreamNames (T353798) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:04 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2053.codfw.wmnet
14:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T354336)', diff saved to https://phabricator.wikimedia.org/P55526 and previous config saved to /var/cache/conftool/dbconfig/20240124-140406-marostegui.json
14:04 klausman@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on ml-serve2005.codfw.wmnet with reason: Machine move (T355437)
14:04 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for Add mediawiki.reference_previews to wgEventLoggingStreamNames (T353798)
14:03 klausman@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on ml-serve2005.codfw.wmnet with reason: Machine move (T355437)
14:01 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2174 (T354336)', diff saved to https://phabricator.wikimedia.org/P55525 and previous config saved to /var/cache/conftool/dbconfig/20240124-140142-marostegui.json
14:01 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2174.codfw.wmnet with reason: Maintenance
14:01 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2174.codfw.wmnet with reason: Maintenance
14:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2173 (T354336)', diff saved to https://phabricator.wikimedia.org/P55524 and previous config saved to /var/cache/conftool/dbconfig/20240124-140120-marostegui.json
13:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1053.eqiad.wmnet
13:54 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 100%: After switchover', diff saved to https://phabricator.wikimedia.org/P55523 and previous config saved to /var/cache/conftool/dbconfig/20240124-135424-root.json
13:50 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1053.eqiad.wmnet
13:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P55522 and previous config saved to /var/cache/conftool/dbconfig/20240124-134614-marostegui.json
13:39 samtar@deploy2002: Finished scap: Backport for Added Diff to approved list of RSS feeds for Foundation Governance Wiki and removed inoperative feed. (T354790) (duration: 09m 14s)
13:39 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 75%: After switchover', diff saved to https://phabricator.wikimedia.org/P55521 and previous config saved to /var/cache/conftool/dbconfig/20240124-133919-root.json
13:37 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2051.codfw.wmnet
13:37 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1051.eqiad.wmnet
13:32 samtar@deploy2002: samtar and varnent: Continuing with sync
13:32 samtar@deploy2002: samtar and varnent: Backport for Added Diff to approved list of RSS feeds for Foundation Governance Wiki and removed inoperative feed. (T354790) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
13:31 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1051.eqiad.wmnet
13:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P55520 and previous config saved to /var/cache/conftool/dbconfig/20240124-133107-marostegui.json
13:31 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2051.codfw.wmnet
13:30 samtar@deploy2002: Started scap: Backport for Added Diff to approved list of RSS feeds for Foundation Governance Wiki and removed inoperative feed. (T354790)
13:24 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 50%: After switchover', diff saved to https://phabricator.wikimedia.org/P55519 and previous config saved to /var/cache/conftool/dbconfig/20240124-132414-root.json
13:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2173 (T354336)', diff saved to https://phabricator.wikimedia.org/P55518 and previous config saved to /var/cache/conftool/dbconfig/20240124-131600-marostegui.json
13:09 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 25%: After switchover', diff saved to https://phabricator.wikimedia.org/P55517 and previous config saved to /var/cache/conftool/dbconfig/20240124-130909-root.json
12:54 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 10%: After switchover', diff saved to https://phabricator.wikimedia.org/P55516 and previous config saved to /var/cache/conftool/dbconfig/20240124-125404-root.json
12:42 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2052.codfw.wmnet
12:39 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 5%: After switchover', diff saved to https://phabricator.wikimedia.org/P55515 and previous config saved to /var/cache/conftool/dbconfig/20240124-123859-root.json
12:34 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2052.codfw.wmnet
12:33 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1052.eqiad.wmnet
12:28 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1052.eqiad.wmnet
12:23 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 1%: After switchover', diff saved to https://phabricator.wikimedia.org/P55514 and previous config saved to /var/cache/conftool/dbconfig/20240124-122354-root.json
12:21 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1231 T355760', diff saved to https://phabricator.wikimedia.org/P55513 and previous config saved to /var/cache/conftool/dbconfig/20240124-122148-root.json
12:20 marostegui@cumin1002: dbctl commit (dc=all): 'Promote db1173 to s6 primary T355760', diff saved to https://phabricator.wikimedia.org/P55512 and previous config saved to /var/cache/conftool/dbconfig/20240124-122030-marostegui.json
12:19 marostegui: Starting s6 eqiad failover from db1231 to db1173 - T355760
12:15 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db2186.codfw.wmnet with reason: Maintenance
12:15 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on db2186.codfw.wmnet with reason: Maintenance
12:15 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2173.codfw.wmnet with reason: Maintenance
12:14 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2173.codfw.wmnet with reason: Maintenance
12:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311 (T354336)', diff saved to https://phabricator.wikimedia.org/P55510 and previous config saved to /var/cache/conftool/dbconfig/20240124-121448-marostegui.json
12:07 ladsgroup@deploy2002: Finished scap: Backport for GenerateFancyCaptchas: Add ->disableSandbox() to shell command (duration: 09m 55s)
12:02 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 28 hosts with reason: Primary switchover s6 T355760
12:01 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 28 hosts with reason: Primary switchover s6 T355760
12:00 ladsgroup@deploy2002: ladsgroup: Continuing with sync
11:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311', diff saved to https://phabricator.wikimedia.org/P55509 and previous config saved to /var/cache/conftool/dbconfig/20240124-115942-marostegui.json
11:58 ladsgroup@deploy2002: ladsgroup: Backport for GenerateFancyCaptchas: Add ->disableSandbox() to shell command synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
11:58 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host acmechief-test1001.eqiad.wmnet
11:57 ladsgroup@deploy2002: Started scap: Backport for GenerateFancyCaptchas: Add ->disableSandbox() to shell command
11:57 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
11:56 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
11:56 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2050.codfw.wmnet
11:55 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
11:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host acmechief-test2001.codfw.wmnet
11:55 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1050.eqiad.wmnet
11:54 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
11:52 hnowlan@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
11:52 hnowlan@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
11:49 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2050.codfw.wmnet
11:49 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1050.eqiad.wmnet
11:47 hnowlan@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
11:46 hnowlan@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
11:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311', diff saved to https://phabricator.wikimedia.org/P55506 and previous config saved to /var/cache/conftool/dbconfig/20240124-114435-marostegui.json
11:43 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host acmechief-test2001.codfw.wmnet
11:33 vgutierrez: repool cp3066 - T354424
11:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host snapshot1014.eqiad.wmnet with OS bullseye
11:32 kharlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
11:32 kharlan@deploy2002: helmfile [codfw] START helmfile.d/services/ipoid: apply
11:31 vgutierrez: depooling cp3066 - T354424
11:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311 (T354336)', diff saved to https://phabricator.wikimedia.org/P55505 and previous config saved to /var/cache/conftool/dbconfig/20240124-112929-marostegui.json
11:27 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2170:3311 (T354336)', diff saved to https://phabricator.wikimedia.org/P55504 and previous config saved to /var/cache/conftool/dbconfig/20240124-112705-marostegui.json
11:26 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2170.codfw.wmnet with reason: Maintenance
11:26 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2170.codfw.wmnet with reason: Maintenance
11:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311 (T354336)', diff saved to https://phabricator.wikimedia.org/P55503 and previous config saved to /var/cache/conftool/dbconfig/20240124-112643-marostegui.json
11:26 kharlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
11:26 kharlan@deploy2002: helmfile [eqiad] START helmfile.d/services/ipoid: apply
11:24 kharlan@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
11:24 kharlan@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
11:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311', diff saved to https://phabricator.wikimedia.org/P55501 and previous config saved to /var/cache/conftool/dbconfig/20240124-111136-marostegui.json
11:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on snapshot1014.eqiad.wmnet with reason: host reimage
10:59 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on snapshot1014.eqiad.wmnet with reason: host reimage
10:57 zabe: zabe@mwmaint2002:~$ mwscript namespaceDupes.php --wiki=rowikinews --fix # T350889
10:57 marostegui@cumin1002: dbctl commit (dc=all): 'Set db1173 with weight 0 T355760', diff saved to https://phabricator.wikimedia.org/P55500 and previous config saved to /var/cache/conftool/dbconfig/20240124-105702-root.json
10:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311', diff saved to https://phabricator.wikimedia.org/P55499 and previous config saved to /var/cache/conftool/dbconfig/20240124-105630-marostegui.json
10:45 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host snapshot1014.eqiad.wmnet with OS bullseye
10:44 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host snapshot1014.eqiad.wmnet
10:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host snapshot1017.eqiad.wmnet with OS bullseye
10:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311 (T354336)', diff saved to https://phabricator.wikimedia.org/P55498 and previous config saved to /var/cache/conftool/dbconfig/20240124-104123-marostegui.json
10:39 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2167:3311 (T354336)', diff saved to https://phabricator.wikimedia.org/P55497 and previous config saved to /var/cache/conftool/dbconfig/20240124-103900-marostegui.json
10:38 hashar: deployment-server: removing `gerrit` remote from `/srv/mediawiki-staging` given it is tied to a specific username and the `origin` remote already has ssh protocol for push # ping James_F
10:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2167.codfw.wmnet with reason: Maintenance
10:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2167.codfw.wmnet with reason: Maintenance
10:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T354336)', diff saved to https://phabricator.wikimedia.org/P55496 and previous config saved to /var/cache/conftool/dbconfig/20240124-103837-marostegui.json
10:37 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host snapshot1014.eqiad.wmnet
10:36 moritzm: upgrading cumin1002 to pymysql 1.0.2-2~wmf11u1 T355531
10:31 hashar@deploy2002: rebuilt and synchronized wikiversions files: Revert "group1 wikis to 1.42.0-wmf.15" - T354433
10:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P55495 and previous config saved to /var/cache/conftool/dbconfig/20240124-102330-marostegui.json
10:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on snapshot1017.eqiad.wmnet with reason: host reimage
10:10 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on snapshot1017.eqiad.wmnet with reason: host reimage
10:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P55494 and previous config saved to /var/cache/conftool/dbconfig/20240124-100824-marostegui.json
10:00 vgutierrez: repool cp3066 - T354424
09:58 vgutierrez: depooling cp3066 - T354424
09:53 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host snapshot1017.eqiad.wmnet with OS bullseye
09:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T354336)', diff saved to https://phabricator.wikimedia.org/P55493 and previous config saved to /var/cache/conftool/dbconfig/20240124-095317-marostegui.json
09:50 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2153 (T354336)', diff saved to https://phabricator.wikimedia.org/P55492 and previous config saved to /var/cache/conftool/dbconfig/20240124-095054-marostegui.json
09:50 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2153.codfw.wmnet with reason: Maintenance
09:50 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2153.codfw.wmnet with reason: Maintenance
09:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T354336)', diff saved to https://phabricator.wikimedia.org/P55491 and previous config saved to /var/cache/conftool/dbconfig/20240124-095032-marostegui.json
09:49 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2187.codfw.wmnet with reason: A1 codfw maintenance
09:49 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2187.codfw.wmnet with reason: A1 codfw maintenance
09:49 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1037.eqiad.wmnet to cluster eqiad and group C
09:41 ayounsi@cumin2002: END (FAIL) - Cookbook sre.network.tls (exit_code=99) for network device lsw1-f8-eqiad
09:41 ayounsi@cumin2002: START - Cookbook sre.network.tls for network device lsw1-f8-eqiad
09:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P55489 and previous config saved to /var/cache/conftool/dbconfig/20240124-093526-marostegui.json
09:32 ayounsi@cumin1002: END (FAIL) - Cookbook sre.network.tls (exit_code=99) for network device lsw1-f8-eqiad
09:32 ayounsi@cumin1002: START - Cookbook sre.network.tls for network device lsw1-f8-eqiad
09:31 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1037.eqiad.wmnet to cluster eqiad and group C
09:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2136.codfw.wmnet with reason: A1 codfw maintenance T355437
09:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2136.codfw.wmnet with reason: A1 codfw maintenance T355437
09:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2157.codfw.wmnet with reason: A1 codfw maintenance T355437
09:29 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2157.codfw.wmnet with reason: A1 codfw maintenance T355437
09:29 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2158.codfw.wmnet with reason: A1 codfw maintenance T355437
09:29 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2158.codfw.wmnet with reason: A1 codfw maintenance T355437
09:28 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2026.codfw.wmnet with reason: A1 codfw maintenance T355437
09:28 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on es2026.codfw.wmnet with reason: A1 codfw maintenance T355437
09:27 hashar@deploy2002: Synchronized php: group1 wikis to 1.42.0-wmf.15 refs T354433 (duration: 06m 55s)
09:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P55488 and previous config saved to /var/cache/conftool/dbconfig/20240124-092019-marostegui.json
09:20 hashar@deploy2002: rebuilt and synchronized wikiversions files: group1 wikis to 1.42.0-wmf.15 refs T354433
09:08 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ganeti1037.eqiad.wmnet
09:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T354336)', diff saved to https://phabricator.wikimedia.org/P55487 and previous config saved to /var/cache/conftool/dbconfig/20240124-090512-marostegui.json
09:02 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2146 (T354336)', diff saved to https://phabricator.wikimedia.org/P55486 and previous config saved to /var/cache/conftool/dbconfig/20240124-090250-marostegui.json
09:02 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2146.codfw.wmnet with reason: Maintenance
09:02 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2146.codfw.wmnet with reason: Maintenance
09:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T354336)', diff saved to https://phabricator.wikimedia.org/P55485 and previous config saved to /var/cache/conftool/dbconfig/20240124-090228-marostegui.json
08:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1037.eqiad.wmnet
08:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P55484 and previous config saved to /var/cache/conftool/dbconfig/20240124-084721-marostegui.json
08:45 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host ganeti1037.eqiad.wmnet
08:36 hashar@deploy2002: Finished scap: Backport for Use a class for 'LogActionsHandlers' (T355680) (duration: 08m 00s)
08:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P55483 and previous config saved to /var/cache/conftool/dbconfig/20240124-083215-marostegui.json
08:30 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1037.eqiad.wmnet
08:30 hashar@deploy2002: hashar: Continuing with sync
08:30 hashar@deploy2002: hashar: Backport for Use a class for 'LogActionsHandlers' (T355680) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
08:28 hashar@deploy2002: Started scap: Backport for Use a class for 'LogActionsHandlers' (T355680)
08:25 logmsgbot: wmde-fisch@deploy2002 Finished scap: Backport for Allow Cite events for reference previews baseline stats (T353798) (duration: 08m 32s)
08:18 logmsgbot: wmde-fisch@deploy2002 wmde-fisch: Continuing with sync
08:18 logmsgbot: wmde-fisch@deploy2002 wmde-fisch: Backport for Allow Cite events for reference previews baseline stats (T353798) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
08:17 logmsgbot: wmde-fisch@deploy2002 Started scap: Backport for Allow Cite events for reference previews baseline stats (T353798)
08:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T354336)', diff saved to https://phabricator.wikimedia.org/P55482 and previous config saved to /var/cache/conftool/dbconfig/20240124-081708-marostegui.json
08:14 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2145 (T354336)', diff saved to https://phabricator.wikimedia.org/P55481 and previous config saved to /var/cache/conftool/dbconfig/20240124-081445-marostegui.json
08:14 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2145.codfw.wmnet with reason: Maintenance
08:14 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2145.codfw.wmnet with reason: Maintenance
08:13 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2141.codfw.wmnet with reason: Maintenance
08:13 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2141.codfw.wmnet with reason: Maintenance
08:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2130 (T354336)', diff saved to https://phabricator.wikimedia.org/P55480 and previous config saved to /var/cache/conftool/dbconfig/20240124-081340-marostegui.json
08:10 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55479 and previous config saved to /var/cache/conftool/dbconfig/20240124-081050-root.json
08:07 logmsgbot: wmde-fisch@deploy2002 wmde-fisch: Backport for Allow Cite events for reference previews baseline stats (T353798) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
08:05 logmsgbot: wmde-fisch@deploy2002 Started scap: Backport for Allow Cite events for reference previews baseline stats (T353798)
07:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P55478 and previous config saved to /var/cache/conftool/dbconfig/20240124-075834-marostegui.json
07:55 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55477 and previous config saved to /var/cache/conftool/dbconfig/20240124-075545-root.json
07:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P55476 and previous config saved to /var/cache/conftool/dbconfig/20240124-074327-marostegui.json
07:40 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55475 and previous config saved to /var/cache/conftool/dbconfig/20240124-074040-root.json
07:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2130 (T354336)', diff saved to https://phabricator.wikimedia.org/P55474 and previous config saved to /var/cache/conftool/dbconfig/20240124-072821-marostegui.json
07:25 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2130 (T354336)', diff saved to https://phabricator.wikimedia.org/P55473 and previous config saved to /var/cache/conftool/dbconfig/20240124-072557-marostegui.json
07:25 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2130.codfw.wmnet with reason: Maintenance
07:25 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55472 and previous config saved to /var/cache/conftool/dbconfig/20240124-072535-root.json
07:25 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2130.codfw.wmnet with reason: Maintenance
07:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2116 (T354336)', diff saved to https://phabricator.wikimedia.org/P55471 and previous config saved to /var/cache/conftool/dbconfig/20240124-072523-marostegui.json
07:19 marostegui@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 100%: Repool db2175 after a crash T355489', diff saved to https://phabricator.wikimedia.org/P55470 and previous config saved to /var/cache/conftool/dbconfig/20240124-071954-root.json
07:10 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55469 and previous config saved to /var/cache/conftool/dbconfig/20240124-071030-root.json
07:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P55468 and previous config saved to /var/cache/conftool/dbconfig/20240124-071016-marostegui.json
07:04 marostegui@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 75%: Repool db2175 after a crash T355489', diff saved to https://phabricator.wikimedia.org/P55467 and previous config saved to /var/cache/conftool/dbconfig/20240124-070449-root.json
06:55 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55466 and previous config saved to /var/cache/conftool/dbconfig/20240124-065525-root.json
06:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P55465 and previous config saved to /var/cache/conftool/dbconfig/20240124-065510-marostegui.json
06:49 marostegui@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 50%: Repool db2175 after a crash T355489', diff saved to https://phabricator.wikimedia.org/P55464 and previous config saved to /var/cache/conftool/dbconfig/20240124-064944-root.json
06:40 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2129.codfw.wmnet with OS bookworm
06:40 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55463 and previous config saved to /var/cache/conftool/dbconfig/20240124-064020-root.json
06:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2116 (T354336)', diff saved to https://phabricator.wikimedia.org/P55462 and previous config saved to /var/cache/conftool/dbconfig/20240124-064003-marostegui.json
06:37 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2116 (T354336)', diff saved to https://phabricator.wikimedia.org/P55461 and previous config saved to /var/cache/conftool/dbconfig/20240124-063739-marostegui.json
06:37 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2116.codfw.wmnet with reason: Maintenance
06:37 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2116.codfw.wmnet with reason: Maintenance
06:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2112 (T354336)', diff saved to https://phabricator.wikimedia.org/P55460 and previous config saved to /var/cache/conftool/dbconfig/20240124-063717-marostegui.json
06:34 marostegui@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 25%: Repool db2175 after a crash T355489', diff saved to https://phabricator.wikimedia.org/P55459 and previous config saved to /var/cache/conftool/dbconfig/20240124-063440-root.json
06:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2112', diff saved to https://phabricator.wikimedia.org/P55458 and previous config saved to /var/cache/conftool/dbconfig/20240124-062210-marostegui.json
06:19 marostegui@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 10%: Repool db2175 after a crash T355489', diff saved to https://phabricator.wikimedia.org/P55457 and previous config saved to /var/cache/conftool/dbconfig/20240124-061934-root.json
06:18 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2129.codfw.wmnet with reason: host reimage
06:15 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2129.codfw.wmnet with reason: host reimage
06:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2112', diff saved to https://phabricator.wikimedia.org/P55456 and previous config saved to /var/cache/conftool/dbconfig/20240124-060703-marostegui.json
06:04 marostegui@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 5%: Repool db2175 after a crash T355489', diff saved to https://phabricator.wikimedia.org/P55455 and previous config saved to /var/cache/conftool/dbconfig/20240124-060429-root.json
05:58 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2129.codfw.wmnet with OS bookworm
05:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2129 T354506', diff saved to https://phabricator.wikimedia.org/P55454 and previous config saved to /var/cache/conftool/dbconfig/20240124-055635-marostegui.json
05:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2112 (T354336)', diff saved to https://phabricator.wikimedia.org/P55453 and previous config saved to /var/cache/conftool/dbconfig/20240124-055157-marostegui.json
05:51 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2158 db2157 es2026 db2136 T355437', diff saved to https://phabricator.wikimedia.org/P55452 and previous config saved to /var/cache/conftool/dbconfig/20240124-055143-marostegui.json
05:49 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2112 (T354336)', diff saved to https://phabricator.wikimedia.org/P55451 and previous config saved to /var/cache/conftool/dbconfig/20240124-054932-marostegui.json
05:49 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2112.codfw.wmnet with reason: Maintenance
05:49 marostegui@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 1%: Repool db2175 after a crash T355489', diff saved to https://phabricator.wikimedia.org/P55450 and previous config saved to /var/cache/conftool/dbconfig/20240124-054924-root.json
05:49 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2112.codfw.wmnet with reason: Maintenance
05:48 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2102.codfw.wmnet with reason: Maintenance
05:48 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2102.codfw.wmnet with reason: Maintenance
05:45 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1184.eqiad.wmnet with reason: Maintenance
05:45 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1184.eqiad.wmnet with reason: Maintenance
02:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
02:32 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
02:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234 (T354336)', diff saved to https://phabricator.wikimedia.org/P55449 and previous config saved to /var/cache/conftool/dbconfig/20240124-023210-marostegui.json
02:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P55448 and previous config saved to /var/cache/conftool/dbconfig/20240124-021704-marostegui.json
02:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P55447 and previous config saved to /var/cache/conftool/dbconfig/20240124-020157-marostegui.json
01:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234 (T354336)', diff saved to https://phabricator.wikimedia.org/P55445 and previous config saved to /var/cache/conftool/dbconfig/20240124-014651-marostegui.json
01:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1234 (T354336)', diff saved to https://phabricator.wikimedia.org/P55444 and previous config saved to /var/cache/conftool/dbconfig/20240124-014430-marostegui.json
01:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1234.eqiad.wmnet with reason: Maintenance
01:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1234.eqiad.wmnet with reason: Maintenance
01:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232 (T354336)', diff saved to https://phabricator.wikimedia.org/P55443 and previous config saved to /var/cache/conftool/dbconfig/20240124-014408-marostegui.json
01:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P55442 and previous config saved to /var/cache/conftool/dbconfig/20240124-012902-marostegui.json
01:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P55441 and previous config saved to /var/cache/conftool/dbconfig/20240124-011355-marostegui.json
00:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232 (T354336)', diff saved to https://phabricator.wikimedia.org/P55440 and previous config saved to /var/cache/conftool/dbconfig/20240124-005849-marostegui.json
00:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1232 (T354336)', diff saved to https://phabricator.wikimedia.org/P55439 and previous config saved to /var/cache/conftool/dbconfig/20240124-005627-marostegui.json
00:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1232.eqiad.wmnet with reason: Maintenance
00:56 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1232.eqiad.wmnet with reason: Maintenance
00:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1228 (T354336)', diff saved to https://phabricator.wikimedia.org/P55438 and previous config saved to /var/cache/conftool/dbconfig/20240124-005605-marostegui.json
00:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1228', diff saved to https://phabricator.wikimedia.org/P55437 and previous config saved to /var/cache/conftool/dbconfig/20240124-004058-marostegui.json
00:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1228', diff saved to https://phabricator.wikimedia.org/P55436 and previous config saved to /var/cache/conftool/dbconfig/20240124-002551-marostegui.json
00:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1228 (T354336)', diff saved to https://phabricator.wikimedia.org/P55435 and previous config saved to /var/cache/conftool/dbconfig/20240124-001044-marostegui.json
00:08 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1228 (T354336)', diff saved to https://phabricator.wikimedia.org/P55434 and previous config saved to /var/cache/conftool/dbconfig/20240124-000824-marostegui.json
00:08 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1228.eqiad.wmnet with reason: Maintenance
00:08 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1228.eqiad.wmnet with reason: Maintenance
00:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219 (T354336)', diff saved to https://phabricator.wikimedia.org/P55433 and previous config saved to /var/cache/conftool/dbconfig/20240124-000802-marostegui.json

2024-01-23

23:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P55432 and previous config saved to /var/cache/conftool/dbconfig/20240123-235255-marostegui.json
23:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P55430 and previous config saved to /var/cache/conftool/dbconfig/20240123-233749-marostegui.json
23:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219 (T354336)', diff saved to https://phabricator.wikimedia.org/P55429 and previous config saved to /var/cache/conftool/dbconfig/20240123-232242-marostegui.json
23:20 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1219 (T354336)', diff saved to https://phabricator.wikimedia.org/P55428 and previous config saved to /var/cache/conftool/dbconfig/20240123-232021-marostegui.json
23:20 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1219.eqiad.wmnet with reason: Maintenance
23:20 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1219.eqiad.wmnet with reason: Maintenance
23:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T354336)', diff saved to https://phabricator.wikimedia.org/P55427 and previous config saved to /var/cache/conftool/dbconfig/20240123-231959-marostegui.json
23:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P55426 and previous config saved to /var/cache/conftool/dbconfig/20240123-230453-marostegui.json
22:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P55425 and previous config saved to /var/cache/conftool/dbconfig/20240123-224946-marostegui.json
22:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T354336)', diff saved to https://phabricator.wikimedia.org/P55424 and previous config saved to /var/cache/conftool/dbconfig/20240123-223439-marostegui.json
22:32 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1218 (T354336)', diff saved to https://phabricator.wikimedia.org/P55423 and previous config saved to /var/cache/conftool/dbconfig/20240123-223215-marostegui.json
22:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1218.eqiad.wmnet with reason: Maintenance
22:32 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1218.eqiad.wmnet with reason: Maintenance
22:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T354336)', diff saved to https://phabricator.wikimedia.org/P55422 and previous config saved to /var/cache/conftool/dbconfig/20240123-223153-marostegui.json
22:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P55421 and previous config saved to /var/cache/conftool/dbconfig/20240123-221646-marostegui.json
22:03 kostajh: UTC late deploys done
22:02 kostajh: T355695 running mwscript resetAuthenticationThrottle.php --wiki=enwikibooks --signup --ip 195.70.81.86
22:02 kostajh: T355695 running mwscript resetAuthenticationThrottle.php --wiki=enwikibooks --signup --ip 62.232.9.14
22:01 kostajh: T355695 running mwscript resetAuthenticationThrottle.php --wiki=enwiki --signup --ip 195.70.81.86
22:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P55420 and previous config saved to /var/cache/conftool/dbconfig/20240123-220140-marostegui.json
22:01 kostajh: T355695 running mwscript resetAuthenticationThrottle.php --wiki=enwiki --signup --ip 62.232.9.14
21:59 kharlan@deploy2002: Finished scap: Backport for [knwiki] Removing the temporary logo (already reverted) (T338136), [itwiki] Add the 'abusefilter-bypass-blocked-external-domains' right to botadmins (T355694), [enwiki] and [enwikibooks] Throttle exemption for event (T355695) (duration: 15m 33s)
21:53 kharlan@deploy2002: superpes and kharlan: Continuing with sync
21:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T354336)', diff saved to https://phabricator.wikimedia.org/P55419 and previous config saved to /var/cache/conftool/dbconfig/20240123-214633-marostegui.json
21:45 kharlan@deploy2002: superpes and kharlan: Backport for [knwiki] Removing the temporary logo (already reverted) (T338136), [itwiki] Add the 'abusefilter-bypass-blocked-external-domains' right to botadmins (T355694), [enwiki] and [enwikibooks] Throttle exemption for event (T355695) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
21:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1207 (T354336)', diff saved to https://phabricator.wikimedia.org/P55418 and previous config saved to /var/cache/conftool/dbconfig/20240123-214413-marostegui.json
21:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1207.eqiad.wmnet with reason: Maintenance
21:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1207.eqiad.wmnet with reason: Maintenance
21:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206 (T354336)', diff saved to https://phabricator.wikimedia.org/P55417 and previous config saved to /var/cache/conftool/dbconfig/20240123-214351-marostegui.json
21:43 kharlan@deploy2002: Started scap: Backport for [knwiki] Removing the temporary logo (already reverted) (T338136), [itwiki] Add the 'abusefilter-bypass-blocked-external-domains' right to botadmins (T355694), [enwiki] and [enwikibooks] Throttle exemption for event (T355695)
21:36 kharlan@deploy2002: Finished scap: Backport for revertrisk: Fix i18n message reference (T348298), revertrisk: Fix i18n messages (T348298) (duration: 30m 51s)
21:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P55416 and previous config saved to /var/cache/conftool/dbconfig/20240123-212845-marostegui.json
21:26 kharlan@deploy2002: kharlan: Continuing with sync
21:26 kharlan@deploy2002: kharlan: Backport for revertrisk: Fix i18n message reference (T348298), revertrisk: Fix i18n messages (T348298) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
21:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P55415 and previous config saved to /var/cache/conftool/dbconfig/20240123-211338-marostegui.json
21:05 kharlan@deploy2002: Started scap: Backport for revertrisk: Fix i18n message reference (T348298), revertrisk: Fix i18n messages (T348298)
20:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206 (T354336)', diff saved to https://phabricator.wikimedia.org/P55414 and previous config saved to /var/cache/conftool/dbconfig/20240123-205832-marostegui.json
20:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1206 (T354336)', diff saved to https://phabricator.wikimedia.org/P55413 and previous config saved to /var/cache/conftool/dbconfig/20240123-205611-marostegui.json
20:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1206.eqiad.wmnet with reason: Maintenance
20:55 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1206.eqiad.wmnet with reason: Maintenance
20:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T354336)', diff saved to https://phabricator.wikimedia.org/P55412 and previous config saved to /var/cache/conftool/dbconfig/20240123-205549-marostegui.json
20:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P55411 and previous config saved to /var/cache/conftool/dbconfig/20240123-204043-marostegui.json
20:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P55410 and previous config saved to /var/cache/conftool/dbconfig/20240123-202536-marostegui.json
20:23 cstone: payments-wiki upgraded from c2138768 to a3691a8e
20:23 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T347624, test data xfer) xfer categories from wdqs2024.codfw.wmnet -> wdqs2025.codfw.wmnet w/ force delete existing files, repooling both afterwards
20:12 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T347624, test data xfer) xfer categories from wdqs2024.codfw.wmnet -> wdqs2025.codfw.wmnet w/ force delete existing files, repooling both afterwards
20:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T354336)', diff saved to https://phabricator.wikimedia.org/P55409 and previous config saved to /var/cache/conftool/dbconfig/20240123-201030-marostegui.json
20:08 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T347624, test data xfer) xfer categories from wdqs2024.codfw.wmnet -> wdqs2025.codfw.wmnet w/ force delete existing files, repooling both afterwards
20:08 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1196 (T354336)', diff saved to https://phabricator.wikimedia.org/P55408 and previous config saved to /var/cache/conftool/dbconfig/20240123-200809-marostegui.json
20:08 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
20:07 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
20:07 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1196.eqiad.wmnet with reason: Maintenance
20:07 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1196.eqiad.wmnet with reason: Maintenance
20:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T354336)', diff saved to https://phabricator.wikimedia.org/P55407 and previous config saved to /var/cache/conftool/dbconfig/20240123-200726-marostegui.json
19:57 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T347624, test data xfer) xfer categories from wdqs2024.codfw.wmnet -> wdqs2025.codfw.wmnet w/ force delete existing files, repooling both afterwards
19:57 bking@cumin2002: END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97) (T347624, test data xfer) xfer categories from wdqs2024.codfw.wmnet -> wdqs2025.codfw.wmnet, repooling both afterwards
19:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P55406 and previous config saved to /var/cache/conftool/dbconfig/20240123-195220-marostegui.json
19:49 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T347624, test data xfer) xfer categories from wdqs2024.codfw.wmnet -> wdqs2025.codfw.wmnet, repooling both afterwards
19:45 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs[2024-2025].codfw.wmnet with reason: testing data xfter cookbook
19:45 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs[2024-2025].codfw.wmnet with reason: testing data xfter cookbook
19:45 mutante: phab1004 - /srv/phab/phabricator/bin/mail volume
19:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P55405 and previous config saved to /var/cache/conftool/dbconfig/20240123-193713-marostegui.json
19:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T354336)', diff saved to https://phabricator.wikimedia.org/P55404 and previous config saved to /var/cache/conftool/dbconfig/20240123-192207-marostegui.json
19:21 ejegg: fundraising civicrm upgraded from d8b0c977 to b85b6dde
19:19 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1186 (T354336)', diff saved to https://phabricator.wikimedia.org/P55403 and previous config saved to /var/cache/conftool/dbconfig/20240123-191945-marostegui.json
19:19 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1186.eqiad.wmnet with reason: Maintenance
19:19 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1186.eqiad.wmnet with reason: Maintenance
19:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T354336)', diff saved to https://phabricator.wikimedia.org/P55402 and previous config saved to /var/cache/conftool/dbconfig/20240123-191922-marostegui.json
19:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P55401 and previous config saved to /var/cache/conftool/dbconfig/20240123-190416-marostegui.json
18:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P55400 and previous config saved to /var/cache/conftool/dbconfig/20240123-184909-marostegui.json
18:43 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['sretest1003.eqiad.wmnet']
18:37 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['sretest1003.eqiad.wmnet']
18:37 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['sretest1003.eqiad.wmnet']
18:36 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['sretest1003.eqiad.wmnet']
18:35 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['sretest1003.eqiad.wmnet']
18:35 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['sretest1003.eqiad.wmnet']
18:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T354336)', diff saved to https://phabricator.wikimedia.org/P55399 and previous config saved to /var/cache/conftool/dbconfig/20240123-183403-marostegui.json
18:31 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1169 (T354336)', diff saved to https://phabricator.wikimedia.org/P55398 and previous config saved to /var/cache/conftool/dbconfig/20240123-183141-marostegui.json
18:31 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1169.eqiad.wmnet with reason: Maintenance
18:31 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1169.eqiad.wmnet with reason: Maintenance
18:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T354336)', diff saved to https://phabricator.wikimedia.org/P55397 and previous config saved to /var/cache/conftool/dbconfig/20240123-183120-marostegui.json
18:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P55396 and previous config saved to /var/cache/conftool/dbconfig/20240123-181613-marostegui.json
18:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P55395 and previous config saved to /var/cache/conftool/dbconfig/20240123-180107-marostegui.json
17:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T354336)', diff saved to https://phabricator.wikimedia.org/P55394 and previous config saved to /var/cache/conftool/dbconfig/20240123-174600-marostegui.json
17:43 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1163 (T354336)', diff saved to https://phabricator.wikimedia.org/P55393 and previous config saved to /var/cache/conftool/dbconfig/20240123-174339-marostegui.json
17:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1163.eqiad.wmnet with reason: Maintenance
17:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1163.eqiad.wmnet with reason: Maintenance
17:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1140.eqiad.wmnet with reason: Maintenance
17:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1140.eqiad.wmnet with reason: Maintenance
17:42 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1139.eqiad.wmnet with reason: Maintenance
17:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1139.eqiad.wmnet with reason: Maintenance
17:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T354336)', diff saved to https://phabricator.wikimedia.org/P55392 and previous config saved to /var/cache/conftool/dbconfig/20240123-174215-marostegui.json
17:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P55391 and previous config saved to /var/cache/conftool/dbconfig/20240123-172709-marostegui.json
17:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P55390 and previous config saved to /var/cache/conftool/dbconfig/20240123-171202-marostegui.json
16:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T354336)', diff saved to https://phabricator.wikimedia.org/P55389 and previous config saved to /var/cache/conftool/dbconfig/20240123-165656-marostegui.json
16:54 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1135 (T354336)', diff saved to https://phabricator.wikimedia.org/P55388 and previous config saved to /var/cache/conftool/dbconfig/20240123-165433-marostegui.json
16:54 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1135.eqiad.wmnet with reason: Maintenance
16:54 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1135.eqiad.wmnet with reason: Maintenance
16:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1106.eqiad.wmnet with reason: Maintenance
16:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1106.eqiad.wmnet with reason: Maintenance
16:49 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest1003.eqiad.wmnet with OS bookworm
16:39 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host sretest1003.eqiad.wmnet with OS bookworm
16:14 ayounsi@cumin1002: END (FAIL) - Cookbook sre.network.tls (exit_code=99) for network device lsw1-f8-eqiad
16:14 ayounsi@cumin1002: START - Cookbook sre.network.tls for network device lsw1-f8-eqiad
16:14 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55387 and previous config saved to /var/cache/conftool/dbconfig/20240123-161426-root.json
16:10 sukhe: enable puppet on A:lvs to merge CR 991785 and run agent on all nodes
15:59 sukhe: disable puppet on A:lvs to merge CR 991785
15:59 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55386 and previous config saved to /var/cache/conftool/dbconfig/20240123-155921-root.json
15:55 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
15:54 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
15:54 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
15:53 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
15:52 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
15:52 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
15:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T352010)', diff saved to https://phabricator.wikimedia.org/P55385 and previous config saved to /var/cache/conftool/dbconfig/20240123-155219-ladsgroup.json
15:44 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55384 and previous config saved to /var/cache/conftool/dbconfig/20240123-154416-root.json
15:41 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2094.codfw.wmnet with OS bullseye
15:39 claime: trafficserver: move 30% of traffic to mw on k8s - T355532
15:37 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
15:37 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
15:37 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
15:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P55383 and previous config saved to /var/cache/conftool/dbconfig/20240123-153712-ladsgroup.json
15:36 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
15:36 claime: Bumping mw-api-ext replicas - T355532
15:36 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
15:36 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
15:35 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
15:35 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
15:35 claime: Bumping mw-web replicas - T355532
15:33 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [codfw] DONE helmfile.d/services/termbox: apply
15:32 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [codfw] START helmfile.d/services/termbox: apply
15:32 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [eqiad] DONE helmfile.d/services/termbox: apply
15:31 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [eqiad] START helmfile.d/services/termbox: apply
15:31 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [staging] DONE helmfile.d/services/termbox: apply
15:31 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [staging] START helmfile.d/services/termbox: apply
15:29 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55382 and previous config saved to /var/cache/conftool/dbconfig/20240123-152911-root.json
15:22 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [codfw] DONE helmfile.d/services/termbox: apply
15:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P55381 and previous config saved to /var/cache/conftool/dbconfig/20240123-152206-ladsgroup.json
15:21 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [codfw] START helmfile.d/services/termbox: apply
15:21 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [eqiad] DONE helmfile.d/services/termbox: apply
15:20 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [eqiad] START helmfile.d/services/termbox: apply
15:20 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [staging] DONE helmfile.d/services/termbox: apply
15:19 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [staging] START helmfile.d/services/termbox: apply
15:14 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55380 and previous config saved to /var/cache/conftool/dbconfig/20240123-151406-root.json
15:13 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2165.codfw.wmnet with reason: Maintenance
15:13 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2165.codfw.wmnet with reason: Maintenance
15:08 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [codfw] DONE helmfile.d/services/termbox: apply
15:08 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [codfw] START helmfile.d/services/termbox: apply
15:08 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [eqiad] DONE helmfile.d/services/termbox: apply
15:08 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [eqiad] START helmfile.d/services/termbox: apply
15:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T352010)', diff saved to https://phabricator.wikimedia.org/P55379 and previous config saved to /var/cache/conftool/dbconfig/20240123-150659-ladsgroup.json
15:06 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [staging] DONE helmfile.d/services/termbox: apply
15:05 logmsgbot: lucaswerkmeister-wmde@deploy2002 helmfile [staging] START helmfile.d/services/termbox: apply
15:00 Lucas_WMDE: UTC afternoon backport+config window done
14:59 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for ORES: Enable renamed revertrisklanguageagnostic model (T348298) (duration: 11m 20s)
14:59 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55378 and previous config saved to /var/cache/conftool/dbconfig/20240123-145901-root.json
14:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195 (T354336)', diff saved to https://phabricator.wikimedia.org/P55377 and previous config saved to /var/cache/conftool/dbconfig/20240123-145353-marostegui.json
14:53 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and kharlan: Continuing with sync
14:49 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and kharlan: Backport for ORES: Enable renamed revertrisklanguageagnostic model (T348298) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:48 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for ORES: Enable renamed revertrisklanguageagnostic model (T348298)
14:47 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1173.eqiad.wmnet with OS bookworm
14:43 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55376 and previous config saved to /var/cache/conftool/dbconfig/20240123-144356-root.json
14:42 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for Restore support for matching 'LIKE' patterns/wildcards (T355478) (duration: 07m 50s)
14:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195', diff saved to https://phabricator.wikimedia.org/P55375 and previous config saved to /var/cache/conftool/dbconfig/20240123-143846-marostegui.json
14:36 logmsgbot: lucaswerkmeister-wmde@deploy2002 matmarex and lucaswerkmeister-wmde: Continuing with sync
14:36 logmsgbot: lucaswerkmeister-wmde@deploy2002 matmarex and lucaswerkmeister-wmde: Backport for Restore support for matching 'LIKE' patterns/wildcards (T355478) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:34 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for Restore support for matching 'LIKE' patterns/wildcards (T355478)
14:33 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for Restore support for matching 'LIKE' patterns/wildcards (T355478) (duration: 10m 29s)
14:32 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts sretest1003.eqiad.wmnet
14:32 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest1003.eqiad.wmnet
14:27 logmsgbot: lucaswerkmeister-wmde@deploy2002 matmarex and lucaswerkmeister-wmde: Continuing with sync
14:26 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1173.eqiad.wmnet with reason: host reimage
14:24 logmsgbot: lucaswerkmeister-wmde@deploy2002 matmarex and lucaswerkmeister-wmde: Backport for Restore support for matching 'LIKE' patterns/wildcards (T355478) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:24 pt1979@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest1003.eqiad.wmnet
14:23 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1173.eqiad.wmnet with reason: host reimage
14:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195', diff saved to https://phabricator.wikimedia.org/P55374 and previous config saved to /var/cache/conftool/dbconfig/20240123-142339-marostegui.json
14:23 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1003.eqiad.wmnet
14:23 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for Restore support for matching 'LIKE' patterns/wildcards (T355478)
14:20 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2094.codfw.wmnet with OS bullseye
14:18 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for ext-EventLogging,ext-EventStreamConfig: Remove mediawiki.special_diff_interactions stream (T353366) (duration: 11m 49s)
14:15 pt1979@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts sretest1003.eqiad.wmnet
14:12 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and phuedx: Continuing with sync
14:12 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1173.eqiad.wmnet with OS bookworm
14:08 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and phuedx: Backport for ext-EventLogging,ext-EventStreamConfig: Remove mediawiki.special_diff_interactions stream (T353366) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195 (T354336)', diff saved to https://phabricator.wikimedia.org/P55373 and previous config saved to /var/cache/conftool/dbconfig/20240123-140833-marostegui.json
14:07 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1003.eqiad.wmnet
14:06 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for ext-EventLogging,ext-EventStreamConfig: Remove mediawiki.special_diff_interactions stream (T353366)
14:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1173 (T343718)', diff saved to https://phabricator.wikimedia.org/P55372 and previous config saved to /var/cache/conftool/dbconfig/20240123-140636-ladsgroup.json
14:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
14:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
14:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1173.eqiad.wmnet with reason: Maintenance
14:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1173.eqiad.wmnet with reason: Maintenance
13:58 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2195 (T354336)', diff saved to https://phabricator.wikimedia.org/P55371 and previous config saved to /var/cache/conftool/dbconfig/20240123-135819-marostegui.json
13:58 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2195.codfw.wmnet with reason: Maintenance
13:58 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2195.codfw.wmnet with reason: Maintenance
13:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T354336)', diff saved to https://phabricator.wikimedia.org/P55370 and previous config saved to /var/cache/conftool/dbconfig/20240123-135757-marostegui.json
13:52 Dreamy_Jazz: Ran `foreachwikiindblist group0 extensions/MediaModeration/maintenance/resendMatchEmails.php 20200405 --verbose`
13:51 klausman@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
13:50 klausman@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
13:50 klausman@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
13:49 klausman@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
13:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host snapshot1016.eqiad.wmnet with OS bullseye
13:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P55369 and previous config saved to /var/cache/conftool/dbconfig/20240123-134250-marostegui.json
13:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P55368 and previous config saved to /var/cache/conftool/dbconfig/20240123-132744-marostegui.json
13:19 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55367 and previous config saved to /var/cache/conftool/dbconfig/20240123-131909-root.json
13:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on snapshot1016.eqiad.wmnet with reason: host reimage
13:12 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on snapshot1016.eqiad.wmnet with reason: host reimage
13:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T354336)', diff saved to https://phabricator.wikimedia.org/P55366 and previous config saved to /var/cache/conftool/dbconfig/20240123-131237-marostegui.json
13:10 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2181 (T354336)', diff saved to https://phabricator.wikimedia.org/P55365 and previous config saved to /var/cache/conftool/dbconfig/20240123-131027-marostegui.json
13:10 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2181.codfw.wmnet with reason: Maintenance
13:10 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2181.codfw.wmnet with reason: Maintenance
13:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168:3318 (T354336)', diff saved to https://phabricator.wikimedia.org/P55364 and previous config saved to /var/cache/conftool/dbconfig/20240123-131005-marostegui.json
13:04 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55363 and previous config saved to /var/cache/conftool/dbconfig/20240123-130404-root.json
12:56 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host snapshot1016.eqiad.wmnet with OS bullseye
12:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168:3318', diff saved to https://phabricator.wikimedia.org/P55362 and previous config saved to /var/cache/conftool/dbconfig/20240123-125459-marostegui.json
12:49 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55361 and previous config saved to /var/cache/conftool/dbconfig/20240123-124859-root.json
12:45 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host snapshot1017.eqiad.wmnet
12:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168:3318', diff saved to https://phabricator.wikimedia.org/P55360 and previous config saved to /var/cache/conftool/dbconfig/20240123-123952-marostegui.json
12:33 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55359 and previous config saved to /var/cache/conftool/dbconfig/20240123-123354-root.json
12:33 marostegui@cumin1002: dbctl commit (dc=all): 'db2171:3316 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55358 and previous config saved to /var/cache/conftool/dbconfig/20240123-123346-root.json
12:31 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host snapshot1017.eqiad.wmnet
12:28 claime: Restarting killed maintenance job mediawiki_job_MachineVision_prioritize_uncategorized.service
12:26 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for sretest1001.eqiad.wmnet
12:26 kamila@cumin1002: START - Cookbook sre.hosts.remove-downtime for sretest1001.eqiad.wmnet
12:26 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on sretest1001.eqiad.wmnet with reason: testing the cookbook
12:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168:3318 (T354336)', diff saved to https://phabricator.wikimedia.org/P55357 and previous config saved to /var/cache/conftool/dbconfig/20240123-122446-marostegui.json
12:23 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on sretest1001.eqiad.wmnet with reason: testing the cookbook
12:23 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2168:3318 (T354336)', diff saved to https://phabricator.wikimedia.org/P55356 and previous config saved to /var/cache/conftool/dbconfig/20240123-122336-marostegui.json
12:23 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2168.codfw.wmnet with reason: Maintenance
12:23 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2168.codfw.wmnet with reason: Maintenance
12:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318 (T354336)', diff saved to https://phabricator.wikimedia.org/P55355 and previous config saved to /var/cache/conftool/dbconfig/20240123-122314-marostegui.json
12:21 marostegui@cumin1002: dbctl commit (dc=all): 'db2171:3315 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55354 and previous config saved to /var/cache/conftool/dbconfig/20240123-122105-root.json
12:18 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55353 and previous config saved to /var/cache/conftool/dbconfig/20240123-121849-root.json
12:18 marostegui@cumin1002: dbctl commit (dc=all): 'db2171:3316 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55352 and previous config saved to /var/cache/conftool/dbconfig/20240123-121841-root.json
12:17 claime: Restarting ferm.service on k8s node mw1495.eqiad.wmnet - T354855
12:16 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host snapshot1016.eqiad.wmnet
12:14 claime: scap::dsh::scap_proxies: Replace mw1486 by mw1405 - T355622
12:13 Amir1: dropping bv2015_edits table from all wikis (T355594)
12:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318', diff saved to https://phabricator.wikimedia.org/P55351 and previous config saved to /var/cache/conftool/dbconfig/20240123-120807-marostegui.json
12:06 marostegui@cumin1002: dbctl commit (dc=all): 'db2171:3315 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55350 and previous config saved to /var/cache/conftool/dbconfig/20240123-120600-root.json
12:05 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host snapshot1016.eqiad.wmnet
12:03 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55349 and previous config saved to /var/cache/conftool/dbconfig/20240123-120344-root.json
12:03 marostegui@cumin1002: dbctl commit (dc=all): 'db2171:3316 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55348 and previous config saved to /var/cache/conftool/dbconfig/20240123-120335-root.json
12:03 Amir1: dropping bv2009_edits table from all wikis (T355594)
12:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host snapshot1017.eqiad.wmnet with OS bullseye
11:54 godog: initial cleanup of replicated thanos blocks - T351927
11:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318', diff saved to https://phabricator.wikimedia.org/P55347 and previous config saved to /var/cache/conftool/dbconfig/20240123-115301-marostegui.json
11:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2171:3315 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55346 and previous config saved to /var/cache/conftool/dbconfig/20240123-115055-root.json
11:48 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55345 and previous config saved to /var/cache/conftool/dbconfig/20240123-114840-root.json
11:48 marostegui@cumin1002: dbctl commit (dc=all): 'db2171:3316 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55344 and previous config saved to /var/cache/conftool/dbconfig/20240123-114831-root.json
11:48 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1173', diff saved to https://phabricator.wikimedia.org/P55343 and previous config saved to /var/cache/conftool/dbconfig/20240123-114826-marostegui.json
11:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318 (T354336)', diff saved to https://phabricator.wikimedia.org/P55342 and previous config saved to /var/cache/conftool/dbconfig/20240123-113754-marostegui.json
11:35 marostegui@cumin1002: dbctl commit (dc=all): 'db2171:3315 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55341 and previous config saved to /var/cache/conftool/dbconfig/20240123-113550-root.json
11:35 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2167:3318 (T354336)', diff saved to https://phabricator.wikimedia.org/P55340 and previous config saved to /var/cache/conftool/dbconfig/20240123-113544-marostegui.json
11:35 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2167.codfw.wmnet with reason: Maintenance
11:35 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2167.codfw.wmnet with reason: Maintenance
11:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T354336)', diff saved to https://phabricator.wikimedia.org/P55339 and previous config saved to /var/cache/conftool/dbconfig/20240123-113522-marostegui.json
11:35 marostegui: Starting s6 eqiad failover from db1173 to db1231 - T355660
11:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on snapshot1017.eqiad.wmnet with reason: host reimage
11:31 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on snapshot1017.eqiad.wmnet with reason: host reimage
11:24 marostegui@cumin1002: dbctl commit (dc=all): 'db2171:3316 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55338 and previous config saved to /var/cache/conftool/dbconfig/20240123-112420-root.json
11:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P55336 and previous config saved to /var/cache/conftool/dbconfig/20240123-112016-marostegui.json
11:11 Amir1: dropping pif_edits table from all wikis (T355594)
11:11 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host snapshot1017.eqiad.wmnet with OS bullseye
11:09 marostegui@cumin1002: dbctl commit (dc=all): 'db2171:3316 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55335 and previous config saved to /var/cache/conftool/dbconfig/20240123-110915-root.json
11:07 marostegui@cumin1002: dbctl commit (dc=all): 'Set db1231 with weight 0 T355660', diff saved to https://phabricator.wikimedia.org/P55333 and previous config saved to /var/cache/conftool/dbconfig/20240123-110743-marostegui.json
11:07 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 28 hosts with reason: Primary switchover s6 T355660
11:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 28 hosts with reason: Primary switchover s6 T355660
11:05 marostegui@cumin1002: dbctl commit (dc=all): 'db2171:3315 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55332 and previous config saved to /var/cache/conftool/dbconfig/20240123-110540-root.json
11:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P55331 and previous config saved to /var/cache/conftool/dbconfig/20240123-110509-marostegui.json
10:58 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-master1002.eqiad.wmnet
10:58 btullis@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
10:58 btullis@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-master1002.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1002"
10:56 btullis@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-master1002.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1002"
10:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2171.codfw.wmnet with OS bookworm
10:54 marostegui@cumin1002: dbctl commit (dc=all): 'db2171:3316 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55330 and previous config saved to /var/cache/conftool/dbconfig/20240123-105410-root.json
10:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2171:3315 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55329 and previous config saved to /var/cache/conftool/dbconfig/20240123-105035-root.json
10:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T354336)', diff saved to https://phabricator.wikimedia.org/P55328 and previous config saved to /var/cache/conftool/dbconfig/20240123-105003-marostegui.json
10:48 btullis@cumin1002: START - Cookbook sre.dns.netbox
10:47 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2166 (T354336)', diff saved to https://phabricator.wikimedia.org/P55327 and previous config saved to /var/cache/conftool/dbconfig/20240123-104753-marostegui.json
10:47 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2166.codfw.wmnet with reason: Maintenance
10:47 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2166.codfw.wmnet with reason: Maintenance
10:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T354336)', diff saved to https://phabricator.wikimedia.org/P55326 and previous config saved to /var/cache/conftool/dbconfig/20240123-104731-marostegui.json
10:43 btullis@cumin1002: START - Cookbook sre.hosts.decommission for hosts an-master1002.eqiad.wmnet
10:35 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2171.codfw.wmnet with reason: host reimage
10:34 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-master1001.eqiad.wmnet
10:34 btullis@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
10:34 btullis@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-master1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1002"
10:32 btullis@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-master1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1002"
10:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P55325 and previous config saved to /var/cache/conftool/dbconfig/20240123-103225-marostegui.json
10:31 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2171.codfw.wmnet with reason: host reimage
10:27 btullis@cumin1002: START - Cookbook sre.dns.netbox
10:23 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host snapshot1017.eqiad.wmnet with OS bullseye
10:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P55324 and previous config saved to /var/cache/conftool/dbconfig/20240123-101718-marostegui.json
10:13 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts sretest1003.eqiad.wmnet
10:13 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest1003.eqiad.wmnet
10:12 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2171.codfw.wmnet with OS bookworm
10:10 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2171:3315 db2171:3316', diff saved to https://phabricator.wikimedia.org/P55323 and previous config saved to /var/cache/conftool/dbconfig/20240123-101056-marostegui.json
10:10 btullis@cumin1002: START - Cookbook sre.hosts.decommission for hosts an-master1001.eqiad.wmnet
10:04 ayounsi@cumin1002: START - Cookbook sre.hosts.reboot-single for host sretest1003.eqiad.wmnet
10:04 ayounsi@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1003.eqiad.wmnet
10:03 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts sretest1003.eqiad.wmnet
10:03 ayounsi@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1003.eqiad.wmnet
10:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host snapshot1016.eqiad.wmnet with OS bullseye
10:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T354336)', diff saved to https://phabricator.wikimedia.org/P55322 and previous config saved to /var/cache/conftool/dbconfig/20240123-100212-marostegui.json
10:00 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2164 (T354336)', diff saved to https://phabricator.wikimedia.org/P55321 and previous config saved to /var/cache/conftool/dbconfig/20240123-100002-marostegui.json
09:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db2186.codfw.wmnet with reason: Maintenance
09:59 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts sretest1003.eqiad.wmnet
09:59 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on db2186.codfw.wmnet with reason: Maintenance
09:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2164.codfw.wmnet with reason: Maintenance
09:59 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2164.codfw.wmnet with reason: Maintenance
09:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T354336)', diff saved to https://phabricator.wikimedia.org/P55320 and previous config saved to /var/cache/conftool/dbconfig/20240123-095923-marostegui.json
09:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P55319 and previous config saved to /var/cache/conftool/dbconfig/20240123-094417-marostegui.json
09:41 ayounsi@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1003.eqiad.wmnet
09:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on snapshot1016.eqiad.wmnet with reason: host reimage
09:29 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on snapshot1016.eqiad.wmnet with reason: host reimage
09:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P55318 and previous config saved to /var/cache/conftool/dbconfig/20240123-092910-marostegui.json
09:24 hashar@deploy2002: rebuilt and synchronized wikiversions files: group0 wikis to 1.42.0-wmf.15 refs T354433
09:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T354336)', diff saved to https://phabricator.wikimedia.org/P55317 and previous config saved to /var/cache/conftool/dbconfig/20240123-091404-marostegui.json
09:11 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2163 (T354336)', diff saved to https://phabricator.wikimedia.org/P55316 and previous config saved to /var/cache/conftool/dbconfig/20240123-091154-marostegui.json
09:11 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2163.codfw.wmnet with reason: Maintenance
09:11 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2163.codfw.wmnet with reason: Maintenance
09:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2162 (T354336)', diff saved to https://phabricator.wikimedia.org/P55315 and previous config saved to /var/cache/conftool/dbconfig/20240123-091132-marostegui.json
09:04 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts sretest1003.eqiad.wmnet
09:01 ayounsi@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1003.eqiad.wmnet
09:01 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55314 and previous config saved to /var/cache/conftool/dbconfig/20240123-090104-root.json
08:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2162', diff saved to https://phabricator.wikimedia.org/P55313 and previous config saved to /var/cache/conftool/dbconfig/20240123-085625-marostegui.json
08:55 taavi: updating CR firewall policy with https://gerrit.wikimedia.org/r/c/operations/homer/public/+/992245/ https://gerrit.wikimedia.org/r/c/operations/homer/public/+/992359/
08:51 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host snapshot1016.eqiad.wmnet with OS bullseye
08:46 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55312 and previous config saved to /var/cache/conftool/dbconfig/20240123-084559-root.json
08:44 gmodena@deploy2002: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
08:44 gmodena@deploy2002: helmfile [staging] START helmfile.d/services/eventstreams: apply
08:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2155 (T352010)', diff saved to https://phabricator.wikimedia.org/P55311 and previous config saved to /var/cache/conftool/dbconfig/20240123-084301-ladsgroup.json
08:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
08:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
08:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
08:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
08:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119 (T352010)', diff saved to https://phabricator.wikimedia.org/P55310 and previous config saved to /var/cache/conftool/dbconfig/20240123-084244-ladsgroup.json
08:41 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts sretest1002.eqiad.wmnet
08:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2162', diff saved to https://phabricator.wikimedia.org/P55309 and previous config saved to /var/cache/conftool/dbconfig/20240123-084119-marostegui.json
08:39 ayounsi@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1002.eqiad.wmnet
08:37 gmodena@deploy2002: helmfile [staging] START helmfile.d/services/eventstreams: apply
08:30 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55308 and previous config saved to /var/cache/conftool/dbconfig/20240123-083054-root.json
08:28 taavi: updating CR firewall policy with https://gerrit.wikimedia.org/r/c/operations/homer/public/+/992244
08:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119', diff saved to https://phabricator.wikimedia.org/P55307 and previous config saved to /var/cache/conftool/dbconfig/20240123-082738-ladsgroup.json
08:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2162 (T354336)', diff saved to https://phabricator.wikimedia.org/P55306 and previous config saved to /var/cache/conftool/dbconfig/20240123-082613-marostegui.json
08:24 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2162 (T354336)', diff saved to https://phabricator.wikimedia.org/P55305 and previous config saved to /var/cache/conftool/dbconfig/20240123-082402-marostegui.json
08:23 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2162.codfw.wmnet with reason: Maintenance
08:23 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2162.codfw.wmnet with reason: Maintenance
08:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2161 (T354336)', diff saved to https://phabricator.wikimedia.org/P55304 and previous config saved to /var/cache/conftool/dbconfig/20240123-082340-marostegui.json
08:15 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55303 and previous config saved to /var/cache/conftool/dbconfig/20240123-081549-root.json
08:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119', diff saved to https://phabricator.wikimedia.org/P55302 and previous config saved to /var/cache/conftool/dbconfig/20240123-081231-ladsgroup.json
08:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P55301 and previous config saved to /var/cache/conftool/dbconfig/20240123-080834-marostegui.json
08:02 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2051.codfw.wmnet
08:00 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55300 and previous config saved to /var/cache/conftool/dbconfig/20240123-080044-root.json
07:57 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2051.codfw.wmnet
07:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119 (T352010)', diff saved to https://phabricator.wikimedia.org/P55299 and previous config saved to /var/cache/conftool/dbconfig/20240123-075725-ladsgroup.json
07:57 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1051.eqiad.wmnet
07:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P55298 and previous config saved to /var/cache/conftool/dbconfig/20240123-075327-marostegui.json
07:52 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1051.eqiad.wmnet
07:45 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55297 and previous config saved to /var/cache/conftool/dbconfig/20240123-074538-root.json
07:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2161 (T354336)', diff saved to https://phabricator.wikimedia.org/P55296 and previous config saved to /var/cache/conftool/dbconfig/20240123-073821-marostegui.json
07:36 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2161 (T354336)', diff saved to https://phabricator.wikimedia.org/P55295 and previous config saved to /var/cache/conftool/dbconfig/20240123-073610-marostegui.json
07:36 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2161.codfw.wmnet with reason: Maintenance
07:35 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2161.codfw.wmnet with reason: Maintenance
07:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T354336)', diff saved to https://phabricator.wikimedia.org/P55294 and previous config saved to /var/cache/conftool/dbconfig/20240123-073548-marostegui.json
07:34 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1231.eqiad.wmnet with OS bookworm
07:30 marostegui@cumin1002: dbctl commit (dc=all): 'db1231 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55293 and previous config saved to /var/cache/conftool/dbconfig/20240123-073033-root.json
07:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T352010)', diff saved to https://phabricator.wikimedia.org/P55292 and previous config saved to /var/cache/conftool/dbconfig/20240123-073021-ladsgroup.json
07:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P55291 and previous config saved to /var/cache/conftool/dbconfig/20240123-072041-marostegui.json
07:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P55290 and previous config saved to /var/cache/conftool/dbconfig/20240123-071515-ladsgroup.json
07:13 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1231.eqiad.wmnet with reason: host reimage
07:10 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1231.eqiad.wmnet with reason: host reimage
07:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P55289 and previous config saved to /var/cache/conftool/dbconfig/20240123-070535-marostegui.json
07:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P55288 and previous config saved to /var/cache/conftool/dbconfig/20240123-070008-ladsgroup.json
06:57 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1231.eqiad.wmnet with OS bookworm
06:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1231', diff saved to https://phabricator.wikimedia.org/P55287 and previous config saved to /var/cache/conftool/dbconfig/20240123-065606-marostegui.json
06:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T354336)', diff saved to https://phabricator.wikimedia.org/P55285 and previous config saved to /var/cache/conftool/dbconfig/20240123-065029-marostegui.json
06:48 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2154 (T354336)', diff saved to https://phabricator.wikimedia.org/P55284 and previous config saved to /var/cache/conftool/dbconfig/20240123-064819-marostegui.json
06:48 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2154.codfw.wmnet with reason: Maintenance
06:48 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2154.codfw.wmnet with reason: Maintenance
06:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2152 (T354336)', diff saved to https://phabricator.wikimedia.org/P55283 and previous config saved to /var/cache/conftool/dbconfig/20240123-064757-marostegui.json
06:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T352010)', diff saved to https://phabricator.wikimedia.org/P55282 and previous config saved to /var/cache/conftool/dbconfig/20240123-064502-ladsgroup.json
06:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P55281 and previous config saved to /var/cache/conftool/dbconfig/20240123-063250-marostegui.json
06:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P55280 and previous config saved to /var/cache/conftool/dbconfig/20240123-061744-marostegui.json
06:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2152 (T354336)', diff saved to https://phabricator.wikimedia.org/P55279 and previous config saved to /var/cache/conftool/dbconfig/20240123-060237-marostegui.json
06:01 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2152 (T354336)', diff saved to https://phabricator.wikimedia.org/P55278 and previous config saved to /var/cache/conftool/dbconfig/20240123-060127-marostegui.json
06:01 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2152.codfw.wmnet with reason: Maintenance
06:01 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2152.codfw.wmnet with reason: Maintenance
06:01 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2100.codfw.wmnet with reason: Maintenance
06:00 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2100.codfw.wmnet with reason: Maintenance
06:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2098.codfw.wmnet with reason: Maintenance
06:00 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2098.codfw.wmnet with reason: Maintenance
05:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1209.eqiad.wmnet with reason: Maintenance
05:59 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1209.eqiad.wmnet with reason: Maintenance
04:54 mwpresync@deploy2002: Finished scap: testwikis wikis to 1.42.0-wmf.15 refs T354433 (duration: 51m 22s)
04:02 mwpresync@deploy2002: Started scap: testwikis wikis to 1.42.0-wmf.15 refs T354433
01:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2119 (T352010)', diff saved to https://phabricator.wikimedia.org/P55277 and previous config saved to /var/cache/conftool/dbconfig/20240123-011434-ladsgroup.json
01:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2119.codfw.wmnet with reason: Maintenance
01:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2119.codfw.wmnet with reason: Maintenance
00:58 zabe: zabe@mwmaint2002:~$ mwscript namespaceDupes.php --wiki=ruwikinews --fix # T350889
00:57 zabe: zabe@mwmaint2002:~$ mwscript namespaceDupes.php --wiki=fiwikinews --fix # T350889
00:57 zabe: zabe@mwmaint2002:~$ mwscript namespaceDupes.php --wiki=fiwiki --fix # T350889
00:56 zabe: zabe@mwmaint2002:~$ mwscript namespaceDupes.php --wiki=enwiki --fix # T350889
00:55 zabe: zabe@mwmaint2002:~$ mwscript namespaceDupes.php --wiki=cywiki --fix # T350889
00:42 zabe: running 'zabe@mwmaint2002:~$ mwscript namespaceDupes.php --wiki=viwiki --fix' in screen
00:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2179 (T352010)', diff saved to https://phabricator.wikimedia.org/P55276 and previous config saved to /var/cache/conftool/dbconfig/20240123-003338-ladsgroup.json
00:33 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2179.codfw.wmnet with reason: Maintenance
00:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2179.codfw.wmnet with reason: Maintenance
00:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T352010)', diff saved to https://phabricator.wikimedia.org/P55275 and previous config saved to /var/cache/conftool/dbconfig/20240123-003316-ladsgroup.json
00:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P55274 and previous config saved to /var/cache/conftool/dbconfig/20240123-001810-ladsgroup.json
00:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P55273 and previous config saved to /var/cache/conftool/dbconfig/20240123-000303-ladsgroup.json

2024-01-22

23:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T352010)', diff saved to https://phabricator.wikimedia.org/P55272 and previous config saved to /var/cache/conftool/dbconfig/20240122-234757-ladsgroup.json
23:14 zabe@deploy2002: Finished scap: Backport for Stop setting wgShowIPinHeader (T355479), beta: Start reading from af_user(_text)/afh_user(_text) (T355616) (duration: 07m 31s)
23:08 zabe@deploy2002: zabe: Continuing with sync
23:08 zabe@deploy2002: zabe: Backport for Stop setting wgShowIPinHeader (T355479), beta: Start reading from af_user(_text)/afh_user(_text) (T355616) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
23:06 zabe@deploy2002: Started scap: Backport for Stop setting wgShowIPinHeader (T355479), beta: Start reading from af_user(_text)/afh_user(_text) (T355616)
22:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
22:56 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
22:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1226 (T354336)', diff saved to https://phabricator.wikimedia.org/P55271 and previous config saved to /var/cache/conftool/dbconfig/20240122-225618-marostegui.json
22:47 bking@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['elastic2088']
22:47 bking@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['elastic2088']
22:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P55270 and previous config saved to /var/cache/conftool/dbconfig/20240122-224111-marostegui.json
22:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P55269 and previous config saved to /var/cache/conftool/dbconfig/20240122-222605-marostegui.json
22:24 maryum: Deployed patch for T355538
22:14 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudrabbit1003.eqiad.wmnet with OS bookworm
22:14 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - taavi@cumin1002"
22:13 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - taavi@cumin1002"
22:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1226 (T354336)', diff saved to https://phabricator.wikimedia.org/P55268 and previous config saved to /var/cache/conftool/dbconfig/20240122-221058-marostegui.json
22:10 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1226 (T354336)', diff saved to https://phabricator.wikimedia.org/P55267 and previous config saved to /var/cache/conftool/dbconfig/20240122-220850-marostegui.json
22:09 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1226.eqiad.wmnet with reason: Maintenance
22:09 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1226.eqiad.wmnet with reason: Maintenance
22:09 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1216.eqiad.wmnet with reason: Maintenance
22:08 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1216.eqiad.wmnet with reason: Maintenance
22:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214 (T354336)', diff saved to https://phabricator.wikimedia.org/P55266 and previous config saved to /var/cache/conftool/dbconfig/20240122-220811-marostegui.json
21:56 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudrabbit1003.eqiad.wmnet with reason: host reimage
21:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P55265 and previous config saved to /var/cache/conftool/dbconfig/20240122-215305-marostegui.json
21:53 taavi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudrabbit1003.eqiad.wmnet with reason: host reimage
21:51 taavi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
21:51 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add cloudrabbit1003 cloud-private address - taavi@cumin1002"
21:50 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add cloudrabbit1003 cloud-private address - taavi@cumin1002"
21:48 taavi@cumin1002: START - Cookbook sre.dns.netbox
21:46 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "set cloudrabbit1003 as active - taavi@cumin1002"
21:45 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "set cloudrabbit1003 as active - taavi@cumin1002"
21:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P55264 and previous config saved to /var/cache/conftool/dbconfig/20240122-213758-marostegui.json
21:33 taavi@cumin1002: START - Cookbook sre.hosts.reimage for host cloudrabbit1003.eqiad.wmnet with OS bookworm
21:32 taavi@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudrabbit1003.eqiad.wmnet with OS bookworm
21:24 taavi@cumin1002: START - Cookbook sre.hosts.reimage for host cloudrabbit1003.eqiad.wmnet with OS bookworm
21:24 taavi@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudrabbit1003.eqiad.wmnet with OS bookworm
21:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214 (T354336)', diff saved to https://phabricator.wikimedia.org/P55263 and previous config saved to /var/cache/conftool/dbconfig/20240122-212252-marostegui.json
21:21 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1214 (T354336)', diff saved to https://phabricator.wikimedia.org/P55262 and previous config saved to /var/cache/conftool/dbconfig/20240122-212144-marostegui.json
21:21 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1214.eqiad.wmnet with reason: Maintenance
21:21 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1214.eqiad.wmnet with reason: Maintenance
21:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1211 (T354336)', diff saved to https://phabricator.wikimedia.org/P55261 and previous config saved to /var/cache/conftool/dbconfig/20240122-212122-marostegui.json
21:17 taavi@cumin1002: START - Cookbook sre.hosts.reimage for host cloudrabbit1003.eqiad.wmnet with OS bookworm
21:07 taavi@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudrabbit1003
21:07 taavi@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host cloudrabbit1003
21:07 taavi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
21:07 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: allocate IPs for cloudrabbit1003 - taavi@cumin1002"
21:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P55260 and previous config saved to /var/cache/conftool/dbconfig/20240122-210615-marostegui.json
21:05 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: allocate IPs for cloudrabbit1003 - taavi@cumin1002"
21:03 taavi@cumin1002: START - Cookbook sre.dns.netbox
20:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P55259 and previous config saved to /var/cache/conftool/dbconfig/20240122-205109-marostegui.json
20:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1211 (T354336)', diff saved to https://phabricator.wikimedia.org/P55258 and previous config saved to /var/cache/conftool/dbconfig/20240122-203602-marostegui.json
20:33 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1211 (T354336)', diff saved to https://phabricator.wikimedia.org/P55257 and previous config saved to /var/cache/conftool/dbconfig/20240122-203354-marostegui.json
20:33 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1211.eqiad.wmnet with reason: Maintenance
20:33 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1211.eqiad.wmnet with reason: Maintenance
20:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1203 (T354336)', diff saved to https://phabricator.wikimedia.org/P55256 and previous config saved to /var/cache/conftool/dbconfig/20240122-203332-marostegui.json
20:18 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P55255 and previous config saved to /var/cache/conftool/dbconfig/20240122-201826-marostegui.json
20:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P55254 and previous config saved to /var/cache/conftool/dbconfig/20240122-200319-marostegui.json
19:57 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
19:56 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
19:56 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
19:55 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
19:54 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
19:54 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
19:51 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
19:50 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
19:50 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
19:48 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
19:48 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
19:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1203 (T354336)', diff saved to https://phabricator.wikimedia.org/P55253 and previous config saved to /var/cache/conftool/dbconfig/20240122-194813-marostegui.json
19:47 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
19:47 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1203 (T354336)', diff saved to https://phabricator.wikimedia.org/P55252 and previous config saved to /var/cache/conftool/dbconfig/20240122-194704-marostegui.json
19:46 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1203.eqiad.wmnet with reason: Maintenance
19:46 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1203.eqiad.wmnet with reason: Maintenance
19:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1193 (T354336)', diff saved to https://phabricator.wikimedia.org/P55251 and previous config saved to /var/cache/conftool/dbconfig/20240122-194642-marostegui.json
19:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P55250 and previous config saved to /var/cache/conftool/dbconfig/20240122-193136-marostegui.json
19:28 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
19:28 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
19:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P55249 and previous config saved to /var/cache/conftool/dbconfig/20240122-191629-marostegui.json
19:06 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2088.codfw.wmnet with OS bullseye
19:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1193 (T354336)', diff saved to https://phabricator.wikimedia.org/P55248 and previous config saved to /var/cache/conftool/dbconfig/20240122-190123-marostegui.json
19:00 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1193 (T354336)', diff saved to https://phabricator.wikimedia.org/P55247 and previous config saved to /var/cache/conftool/dbconfig/20240122-190014-marostegui.json
19:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1193.eqiad.wmnet with reason: Maintenance
19:00 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1193.eqiad.wmnet with reason: Maintenance
18:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192 (T354336)', diff saved to https://phabricator.wikimedia.org/P55246 and previous config saved to /var/cache/conftool/dbconfig/20240122-185952-marostegui.json
18:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P55245 and previous config saved to /var/cache/conftool/dbconfig/20240122-184446-marostegui.json
18:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P55244 and previous config saved to /var/cache/conftool/dbconfig/20240122-182939-marostegui.json
18:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2172 (T352010)', diff saved to https://phabricator.wikimedia.org/P55243 and previous config saved to /var/cache/conftool/dbconfig/20240122-182432-ladsgroup.json
18:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2172.codfw.wmnet with reason: Maintenance
18:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2172.codfw.wmnet with reason: Maintenance
18:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T352010)', diff saved to https://phabricator.wikimedia.org/P55242 and previous config saved to /var/cache/conftool/dbconfig/20240122-182359-ladsgroup.json
18:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192 (T354336)', diff saved to https://phabricator.wikimedia.org/P55241 and previous config saved to /var/cache/conftool/dbconfig/20240122-181433-marostegui.json
18:13 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1192 (T354336)', diff saved to https://phabricator.wikimedia.org/P55240 and previous config saved to /var/cache/conftool/dbconfig/20240122-181324-marostegui.json
18:13 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1192.eqiad.wmnet with reason: Maintenance
18:13 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1192.eqiad.wmnet with reason: Maintenance
18:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T354336)', diff saved to https://phabricator.wikimedia.org/P55239 and previous config saved to /var/cache/conftool/dbconfig/20240122-181302-marostegui.json
18:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P55238 and previous config saved to /var/cache/conftool/dbconfig/20240122-180853-ladsgroup.json
17:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P55237 and previous config saved to /var/cache/conftool/dbconfig/20240122-175755-marostegui.json
17:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P55236 and previous config saved to /var/cache/conftool/dbconfig/20240122-175346-ladsgroup.json
17:46 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2088.codfw.wmnet with OS bullseye
17:44 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2088.codfw.wmnet with OS bullseye
17:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P55235 and previous config saved to /var/cache/conftool/dbconfig/20240122-174249-marostegui.json
17:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T352010)', diff saved to https://phabricator.wikimedia.org/P55234 and previous config saved to /var/cache/conftool/dbconfig/20240122-173840-ladsgroup.json
17:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T354336)', diff saved to https://phabricator.wikimedia.org/P55233 and previous config saved to /var/cache/conftool/dbconfig/20240122-172743-marostegui.json
17:26 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1178 (T354336)', diff saved to https://phabricator.wikimedia.org/P55232 and previous config saved to /var/cache/conftool/dbconfig/20240122-172635-marostegui.json
17:26 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1178.eqiad.wmnet with reason: Maintenance
17:26 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1178.eqiad.wmnet with reason: Maintenance
17:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T354336)', diff saved to https://phabricator.wikimedia.org/P55231 and previous config saved to /var/cache/conftool/dbconfig/20240122-172612-marostegui.json
17:17 akosiaris: draining kubestage2001, uncordoning kubestage2002 to allow it to receive the pods. T355437
17:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P55230 and previous config saved to /var/cache/conftool/dbconfig/20240122-171106-marostegui.json
17:05 vgutierrez: restore HAProxy tune.bufsize = 16684 in cp3066 - T354424
16:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P55229 and previous config saved to /var/cache/conftool/dbconfig/20240122-165559-marostegui.json
16:53 vgutierrez: testing HAProxy tune.bufsize = 32768 in cp3066 - T354424
16:46 dcausse@deploy2002: Finished deploy [airflow-dags/search@dcf08b2]: (no justification provided) (duration: 00m 31s)
16:46 dcausse@deploy2002: Started deploy [airflow-dags/search@dcf08b2]: (no justification provided)
16:42 Daimona: T353459 Running mwscript /home/daimona/GenerateInvitationList.php to test the script before it reaches production
16:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T354336)', diff saved to https://phabricator.wikimedia.org/P55228 and previous config saved to /var/cache/conftool/dbconfig/20240122-164053-marostegui.json
16:39 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1495.eqiad.wmnet with OS bullseye
16:38 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1177 (T354336)', diff saved to https://phabricator.wikimedia.org/P55227 and previous config saved to /var/cache/conftool/dbconfig/20240122-163844-marostegui.json
16:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1177.eqiad.wmnet with reason: Maintenance
16:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1177.eqiad.wmnet with reason: Maintenance
16:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T354336)', diff saved to https://phabricator.wikimedia.org/P55226 and previous config saved to /var/cache/conftool/dbconfig/20240122-163822-marostegui.json
16:38 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
16:38 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
16:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2155 (T352010)', diff saved to https://phabricator.wikimedia.org/P55225 and previous config saved to /var/cache/conftool/dbconfig/20240122-163808-ladsgroup.json
16:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
16:38 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
16:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
16:37 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
16:37 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
16:37 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
16:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T352010)', diff saved to https://phabricator.wikimedia.org/P55224 and previous config saved to /var/cache/conftool/dbconfig/20240122-163729-ladsgroup.json
16:31 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1486.eqiad.wmnet with OS bullseye
16:29 klausman@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
16:29 klausman@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
16:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P55222 and previous config saved to /var/cache/conftool/dbconfig/20240122-162315-marostegui.json
16:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P55221 and previous config saved to /var/cache/conftool/dbconfig/20240122-162223-ladsgroup.json
16:14 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1495.eqiad.wmnet with reason: host reimage
16:12 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1486.eqiad.wmnet with reason: host reimage
16:09 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1495.eqiad.wmnet with reason: host reimage
16:08 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1486.eqiad.wmnet with reason: host reimage
16:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P55220 and previous config saved to /var/cache/conftool/dbconfig/20240122-160809-marostegui.json
16:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P55219 and previous config saved to /var/cache/conftool/dbconfig/20240122-160716-ladsgroup.json
15:56 marostegui@cumin1002: dbctl commit (dc=all): 'db1165 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55218 and previous config saved to /var/cache/conftool/dbconfig/20240122-155607-root.json
15:55 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1495.eqiad.wmnet with OS bullseye
15:55 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1486.eqiad.wmnet with OS bullseye
15:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T354336)', diff saved to https://phabricator.wikimedia.org/P55217 and previous config saved to /var/cache/conftool/dbconfig/20240122-155302-marostegui.json
15:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T352010)', diff saved to https://phabricator.wikimedia.org/P55216 and previous config saved to /var/cache/conftool/dbconfig/20240122-155210-ladsgroup.json
15:51 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1172 (T354336)', diff saved to https://phabricator.wikimedia.org/P55215 and previous config saved to /var/cache/conftool/dbconfig/20240122-155154-marostegui.json
15:51 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1172.eqiad.wmnet with reason: Maintenance
15:51 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1172.eqiad.wmnet with reason: Maintenance
15:51 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1171.eqiad.wmnet with reason: Maintenance
15:51 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1171.eqiad.wmnet with reason: Maintenance
15:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T354336)', diff saved to https://phabricator.wikimedia.org/P55214 and previous config saved to /var/cache/conftool/dbconfig/20240122-155115-marostegui.json
15:41 marostegui@cumin1002: dbctl commit (dc=all): 'db1165 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55213 and previous config saved to /var/cache/conftool/dbconfig/20240122-154102-root.json
15:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P55212 and previous config saved to /var/cache/conftool/dbconfig/20240122-153608-marostegui.json
15:26 sukhe: sudo cumin -b1 -s120 "A:dns-rec and not P{dns6001*}" "enable-puppet 'do not enable' && run-puppet-agent"
15:25 marostegui@cumin1002: dbctl commit (dc=all): 'db1165 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55211 and previous config saved to /var/cache/conftool/dbconfig/20240122-152557-root.json
15:24 sukhe: re-enable puppet on A:dns-rec and run agent to finish merging CR 979159
15:21 sukhe: enable puppet on dns6001 and run agent to test CR 979159
15:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P55210 and previous config saved to /var/cache/conftool/dbconfig/20240122-152102-marostegui.json
15:13 sukhe: disable Puppet on A:dns-rec to decouple anycast-hc and pdns-rec systemd binding: CR 979159
15:10 marostegui@cumin1002: dbctl commit (dc=all): 'db1165 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55209 and previous config saved to /var/cache/conftool/dbconfig/20240122-151052-root.json
15:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T354336)', diff saved to https://phabricator.wikimedia.org/P55208 and previous config saved to /var/cache/conftool/dbconfig/20240122-150555-marostegui.json
15:00 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1167 (T354336)', diff saved to https://phabricator.wikimedia.org/P55207 and previous config saved to /var/cache/conftool/dbconfig/20240122-150046-marostegui.json
15:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
15:00 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
15:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1167.eqiad.wmnet with reason: Maintenance
15:00 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1167.eqiad.wmnet with reason: Maintenance
14:55 marostegui@cumin1002: dbctl commit (dc=all): 'db1165 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55206 and previous config saved to /var/cache/conftool/dbconfig/20240122-145548-root.json
14:55 hashar@deploy2002: Finished deploy [gerrit/gerrit@6257faa]: Update Zuul plugin for Gerrit 3.7 - T355521 (duration: 00m 07s)
14:54 hashar@deploy2002: Started deploy [gerrit/gerrit@6257faa]: Update Zuul plugin for Gerrit 3.7 - T355521
14:45 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2105.codfw.wmnet with reason: Maintenance
14:45 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2105.codfw.wmnet with reason: Maintenance
14:42 jgiannelos@deploy1002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
14:41 Lucas_WMDE: UTC afternoon backport+config window done
14:41 jgiannelos@deploy1002: helmfile [staging] START helmfile.d/services/mobileapps: apply
14:41 jgiannelos@deploy1002: helmfile [staging] START helmfile.d/services/mobileapps: apply
14:40 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for Set ShowRollbackConfirmation in arwiki (T355213) (duration: 09m 07s)
14:40 marostegui@cumin1002: dbctl commit (dc=all): 'db1165 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55205 and previous config saved to /var/cache/conftool/dbconfig/20240122-144043-root.json
14:40 jgiannelos@deploy1002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
14:40 jgiannelos@deploy1002: helmfile [staging] START helmfile.d/services/mobileapps: apply
14:35 logmsgbot: lucaswerkmeister-wmde@deploy2002 hubaishan and lucaswerkmeister-wmde: Continuing with sync
14:33 logmsgbot: lucaswerkmeister-wmde@deploy2002 hubaishan and lucaswerkmeister-wmde: Backport for Set ShowRollbackConfirmation in arwiki (T355213) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:31 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for Set ShowRollbackConfirmation in arwiki (T355213)
14:30 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for Restrict pagequality-validate right to patroller in arwikisource (T354503) (duration: 09m 41s)
14:28 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1036.eqiad.wmnet to cluster eqiad and group B
14:26 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1036.eqiad.wmnet to cluster eqiad and group B
14:25 marostegui@cumin1002: dbctl commit (dc=all): 'db1165 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55204 and previous config saved to /var/cache/conftool/dbconfig/20240122-142538-root.json
14:25 marostegui@cumin1002: dbctl commit (dc=all): 'Remove db1134', diff saved to https://phabricator.wikimedia.org/P55203 and previous config saved to /var/cache/conftool/dbconfig/20240122-142530-marostegui.json
14:24 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and hubaishan: Continuing with sync
14:21 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and hubaishan: Backport for Restrict pagequality-validate right to patroller in arwikisource (T354503) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:20 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for Restrict pagequality-validate right to patroller in arwikisource (T354503)
13:57 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1165.eqiad.wmnet with OS bookworm
13:36 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1165.eqiad.wmnet with reason: host reimage
13:33 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1165.eqiad.wmnet with reason: host reimage
13:24 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host ganeti1036.eqiad.wmnet
13:22 marostegui: Upgrade sanitarium master, there will be lag on s6 wiki replicas T354506
13:21 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1165.eqiad.wmnet with OS bookworm
13:20 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1165', diff saved to https://phabricator.wikimedia.org/P55201 and previous config saved to /var/cache/conftool/dbconfig/20240122-132023-marostegui.json
13:07 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2050.codfw.wmnet
13:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1036.eqiad.wmnet
13:05 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2049.codfw.wmnet
13:05 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1049.eqiad.wmnet
13:01 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2050.codfw.wmnet
13:00 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1050.eqiad.wmnet
12:59 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1049.eqiad.wmnet
12:59 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2049.codfw.wmnet
12:55 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1050.eqiad.wmnet
12:48 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
12:47 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
12:33 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55200 and previous config saved to /var/cache/conftool/dbconfig/20240122-123351-root.json
12:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190 (T354336)', diff saved to https://phabricator.wikimedia.org/P55199 and previous config saved to /var/cache/conftool/dbconfig/20240122-122634-marostegui.json
12:18 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55198 and previous config saved to /var/cache/conftool/dbconfig/20240122-121846-root.json
12:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P55197 and previous config saved to /var/cache/conftool/dbconfig/20240122-121128-marostegui.json
12:06 volans@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 0:20:00 on sretest1001.eqiad.wmnet with reason: Testing
12:03 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55195 and previous config saved to /var/cache/conftool/dbconfig/20240122-120341-root.json
11:56 volans@cumin1002: START - Cookbook sre.hosts.downtime for 0:20:00 on sretest1001.eqiad.wmnet with reason: Testing
11:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P55193 and previous config saved to /var/cache/conftool/dbconfig/20240122-115621-marostegui.json
11:56 volans@cumin1002: END (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 0:20:00 on sretest1001.eqiad.wmnet with reason: Testing
11:56 volans@cumin1002: START - Cookbook sre.hosts.downtime for 0:20:00 on sretest1001.eqiad.wmnet with reason: Testing
11:48 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55192 and previous config saved to /var/cache/conftool/dbconfig/20240122-114836-root.json
11:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190 (T354336)', diff saved to https://phabricator.wikimedia.org/P55191 and previous config saved to /var/cache/conftool/dbconfig/20240122-114115-marostegui.json
11:41 vgutierrez: update to HAProxy 2.8.5 on cp3066 - T354424
11:33 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55190 and previous config saved to /var/cache/conftool/dbconfig/20240122-113331-root.json
11:26 jelto: start envoy on ticket-test.wikimedia.org to test alerting - T354479
11:24 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2190 (T354336)', diff saved to https://phabricator.wikimedia.org/P55189 and previous config saved to /var/cache/conftool/dbconfig/20240122-112401-marostegui.json
11:23 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2190.codfw.wmnet with reason: Maintenance
11:23 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2190.codfw.wmnet with reason: Maintenance
11:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T354336)', diff saved to https://phabricator.wikimedia.org/P55188 and previous config saved to /var/cache/conftool/dbconfig/20240122-112339-marostegui.json
11:21 jelto: stop envoy on ticket-test.wikimedia.org to test alerting - T354479
11:18 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55187 and previous config saved to /var/cache/conftool/dbconfig/20240122-111826-root.json
11:10 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2048.codfw.wmnet
11:10 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1048.eqiad.wmnet
11:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P55185 and previous config saved to /var/cache/conftool/dbconfig/20240122-110833-marostegui.json
11:04 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2048.codfw.wmnet
11:04 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1048.eqiad.wmnet
11:03 marostegui@cumin1002: dbctl commit (dc=all): 'db2158 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55184 and previous config saved to /var/cache/conftool/dbconfig/20240122-110321-root.json
11:01 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2158.codfw.wmnet with OS bookworm
10:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P55183 and previous config saved to /var/cache/conftool/dbconfig/20240122-105326-marostegui.json
10:52 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3315 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55182 and previous config saved to /var/cache/conftool/dbconfig/20240122-105237-root.json
10:52 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3316 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55181 and previous config saved to /var/cache/conftool/dbconfig/20240122-105222-root.json
10:40 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2158.codfw.wmnet with reason: host reimage
10:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T354336)', diff saved to https://phabricator.wikimedia.org/P55180 and previous config saved to /var/cache/conftool/dbconfig/20240122-103820-marostegui.json
10:37 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3315 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55179 and previous config saved to /var/cache/conftool/dbconfig/20240122-103732-root.json
10:37 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2158.codfw.wmnet with reason: host reimage
10:37 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3316 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55178 and previous config saved to /var/cache/conftool/dbconfig/20240122-103717-root.json
10:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2147 (T352010)', diff saved to https://phabricator.wikimedia.org/P55177 and previous config saved to /var/cache/conftool/dbconfig/20240122-103520-ladsgroup.json
10:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2147.codfw.wmnet with reason: Maintenance
10:35 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2147.codfw.wmnet with reason: Maintenance
10:22 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3315 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55176 and previous config saved to /var/cache/conftool/dbconfig/20240122-102227-root.json
10:22 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2177 (T354336)', diff saved to https://phabricator.wikimedia.org/P55175 and previous config saved to /var/cache/conftool/dbconfig/20240122-102220-marostegui.json
10:22 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2177.codfw.wmnet with reason: Maintenance
10:22 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3316 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55174 and previous config saved to /var/cache/conftool/dbconfig/20240122-102212-root.json
10:22 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2177.codfw.wmnet with reason: Maintenance
10:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T354336)', diff saved to https://phabricator.wikimedia.org/P55173 and previous config saved to /var/cache/conftool/dbconfig/20240122-102158-marostegui.json
10:18 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2158.codfw.wmnet with OS bookworm
10:16 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2158', diff saved to https://phabricator.wikimedia.org/P55172 and previous config saved to /var/cache/conftool/dbconfig/20240122-101634-marostegui.json
10:13 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for gerrit[1003,2002].wikimedia.org
10:13 cgoubert@cumin1002: START - Cookbook sre.hosts.remove-downtime for gerrit[1003,2002].wikimedia.org
10:07 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3315 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55171 and previous config saved to /var/cache/conftool/dbconfig/20240122-100722-root.json
10:07 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3316 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55170 and previous config saved to /var/cache/conftool/dbconfig/20240122-100707-root.json
10:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P55169 and previous config saved to /var/cache/conftool/dbconfig/20240122-100651-marostegui.json
10:04 hashar: gerrit: running jgit gc on every repository to regenerate potentially faulty reachability bitmaps files preventing fetches on some repositories # T355173
10:00 jelto: start envoy on ticket-test.wikimedia.org to test alerting - T354479
09:57 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2049.codfw.wmnet
09:56 jelto: stop envoy on ticket-test.wikimedia.org to test alerting - T354479
09:52 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2049.codfw.wmnet
09:52 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1049.eqiad.wmnet
09:52 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3315 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55167 and previous config saved to /var/cache/conftool/dbconfig/20240122-095217-root.json
09:52 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3316 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55166 and previous config saved to /var/cache/conftool/dbconfig/20240122-095202-root.json
09:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P55165 and previous config saved to /var/cache/conftool/dbconfig/20240122-095145-marostegui.json
09:51 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1035.eqiad.wmnet to cluster eqiad and group A
09:49 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1035.eqiad.wmnet to cluster eqiad and group A
09:47 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1049.eqiad.wmnet
09:38 hashar: Restarted Gerrit with upgraded version 3.7.6 # T354885
09:37 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3315 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55164 and previous config saved to /var/cache/conftool/dbconfig/20240122-093712-root.json
09:36 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3316 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55163 and previous config saved to /var/cache/conftool/dbconfig/20240122-093657-root.json
09:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T354336)', diff saved to https://phabricator.wikimedia.org/P55162 and previous config saved to /var/cache/conftool/dbconfig/20240122-093638-marostegui.json
09:26 cgoubert@cumin1002: conftool action : set/pooled=no; selector: name=mw2394.codfw.wmnet
09:26 cgoubert@cumin1002: conftool action : set/pooled=yes; selector: name=mw2444.codfw.wmnet
09:22 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3315 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55161 and previous config saved to /var/cache/conftool/dbconfig/20240122-092207-root.json
09:21 marostegui@cumin1002: dbctl commit (dc=all): 'db1213:3316 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55160 and previous config saved to /var/cache/conftool/dbconfig/20240122-092152-root.json
09:19 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2156 (T354336)', diff saved to https://phabricator.wikimedia.org/P55159 and previous config saved to /var/cache/conftool/dbconfig/20240122-091916-marostegui.json
09:19 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db2186.codfw.wmnet with reason: Maintenance
09:18 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on db2186.codfw.wmnet with reason: Maintenance
09:18 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2156.codfw.wmnet with reason: Maintenance
09:18 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1035.eqiad.wmnet to cluster eqiad and group A
09:18 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1035.eqiad.wmnet to cluster eqiad and group A
09:18 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2156.codfw.wmnet with reason: Maintenance
09:18 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T354336)', diff saved to https://phabricator.wikimedia.org/P55158 and previous config saved to /var/cache/conftool/dbconfig/20240122-091838-marostegui.json
09:17 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1213.eqiad.wmnet with OS bookworm
09:17 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on gerrit[1003,2002].wikimedia.org with reason: Gerrit update
09:17 cgoubert@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on gerrit[1003,2002].wikimedia.org with reason: Gerrit update
09:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1035.eqiad.wmnet
09:11 hashar: Gerrit: reindexing all changes for 3.6 > 3.7 migration # T354885
09:08 hashar@deploy2002: Finished deploy [gerrit/gerrit@bdd1a8b]: Gerrit to version 3.7.6 (duration: 00m 10s)
09:08 hashar@deploy2002: Started deploy [gerrit/gerrit@bdd1a8b]: Gerrit to version 3.7.6
09:06 hashar: Upgrading Gerrit # T354885
09:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1035.eqiad.wmnet
09:05 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3317 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55157 and previous config saved to /var/cache/conftool/dbconfig/20240122-090504-root.json
09:03 cgoubert@cumin1002: conftool action : set/pooled=no; selector: name=mw2444.codfw.wmnet
09:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P55156 and previous config saved to /var/cache/conftool/dbconfig/20240122-090332-marostegui.json
09:02 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3316 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55155 and previous config saved to /var/cache/conftool/dbconfig/20240122-090218-root.json
09:01 cgoubert@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for mw2394.codfw.wmnet
09:01 cgoubert@cumin1002: START - Cookbook sre.hosts.remove-downtime for mw2394.codfw.wmnet
08:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1213.eqiad.wmnet with reason: host reimage
08:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1213.eqiad.wmnet with reason: host reimage
08:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3317 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55154 and previous config saved to /var/cache/conftool/dbconfig/20240122-084959-root.json
08:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P55153 and previous config saved to /var/cache/conftool/dbconfig/20240122-084825-marostegui.json
08:47 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3316 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55152 and previous config saved to /var/cache/conftool/dbconfig/20240122-084713-root.json
08:43 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2048.codfw.wmnet
08:39 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1213.eqiad.wmnet with OS bookworm
08:38 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1213:3316 db1213:3315', diff saved to https://phabricator.wikimedia.org/P55151 and previous config saved to /var/cache/conftool/dbconfig/20240122-083812-marostegui.json
08:38 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2048.codfw.wmnet
08:37 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1048.eqiad.wmnet
08:35 xSavitar: UTC morning backport window done!
08:34 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3317 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55150 and previous config saved to /var/cache/conftool/dbconfig/20240122-083454-root.json
08:34 derick@deploy2002: Finished scap: Backport for wmf-config: Remove unused wgCentralAuthTokenCacheType (T336004) (duration: 18m 15s)
08:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T354336)', diff saved to https://phabricator.wikimedia.org/P55149 and previous config saved to /var/cache/conftool/dbconfig/20240122-083319-marostegui.json
08:32 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1048.eqiad.wmnet
08:32 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3316 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55148 and previous config saved to /var/cache/conftool/dbconfig/20240122-083208-root.json
08:27 derick@deploy2002: d3r1ck01 and derick: Continuing with sync
08:26 derick@deploy2002: d3r1ck01 and derick: Backport for wmf-config: Remove unused wgCentralAuthTokenCacheType (T336004) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
08:19 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3317 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55147 and previous config saved to /var/cache/conftool/dbconfig/20240122-081950-root.json
08:17 marostegui@cumin1002: dbctl commit (dc=all): 'db1187 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55146 and previous config saved to /var/cache/conftool/dbconfig/20240122-081727-root.json
08:17 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3316 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55145 and previous config saved to /var/cache/conftool/dbconfig/20240122-081703-root.json
08:16 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2149 (T354336)', diff saved to https://phabricator.wikimedia.org/P55144 and previous config saved to /var/cache/conftool/dbconfig/20240122-081618-marostegui.json
08:16 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2149.codfw.wmnet with reason: Maintenance
08:15 derick@deploy2002: Started scap: Backport for wmf-config: Remove unused wgCentralAuthTokenCacheType (T336004)
08:15 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2149.codfw.wmnet with reason: Maintenance
08:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2127 (T354336)', diff saved to https://phabricator.wikimedia.org/P55143 and previous config saved to /var/cache/conftool/dbconfig/20240122-081545-marostegui.json
08:04 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3317 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55142 and previous config saved to /var/cache/conftool/dbconfig/20240122-080445-root.json
08:02 marostegui@cumin1002: dbctl commit (dc=all): 'db1187 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55141 and previous config saved to /var/cache/conftool/dbconfig/20240122-080222-root.json
08:01 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3316 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55140 and previous config saved to /var/cache/conftool/dbconfig/20240122-080158-root.json
08:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2127', diff saved to https://phabricator.wikimedia.org/P55139 and previous config saved to /var/cache/conftool/dbconfig/20240122-080038-marostegui.json
07:54 root@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Shubhankar Patankar out of all services on: 2208 hosts
07:53 root@cumin2002: START - Cookbook sre.idm.logout Logging Shubhankar Patankar out of all services on: 2208 hosts
07:49 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3317 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55138 and previous config saved to /var/cache/conftool/dbconfig/20240122-074940-root.json
07:47 marostegui@cumin1002: dbctl commit (dc=all): 'db1187 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55137 and previous config saved to /var/cache/conftool/dbconfig/20240122-074717-root.json
07:46 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3316 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55136 and previous config saved to /var/cache/conftool/dbconfig/20240122-074653-root.json
07:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2127', diff saved to https://phabricator.wikimedia.org/P55135 and previous config saved to /var/cache/conftool/dbconfig/20240122-074532-marostegui.json
07:36 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2169.codfw.wmnet with OS bookworm
07:34 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3317 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55134 and previous config saved to /var/cache/conftool/dbconfig/20240122-073435-root.json
07:32 marostegui@cumin1002: dbctl commit (dc=all): 'db1187 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55133 and previous config saved to /var/cache/conftool/dbconfig/20240122-073212-root.json
07:31 marostegui@cumin1002: dbctl commit (dc=all): 'db2169:3316 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55132 and previous config saved to /var/cache/conftool/dbconfig/20240122-073148-root.json
07:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2127 (T354336)', diff saved to https://phabricator.wikimedia.org/P55131 and previous config saved to /var/cache/conftool/dbconfig/20240122-073025-marostegui.json
07:28 kart_: Updated MinT to 2024-01-22-053144-production (T355303, T338608, T353510, T354666)
07:20 kartik@deploy2002: helmfile [eqiad] DONE helmfile.d/services/machinetranslation: apply
07:17 marostegui@cumin1002: dbctl commit (dc=all): 'db1187 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55130 and previous config saved to /var/cache/conftool/dbconfig/20240122-071707-root.json
07:14 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2169.codfw.wmnet with reason: host reimage
07:13 kartik@deploy2002: helmfile [eqiad] START helmfile.d/services/machinetranslation: apply
07:12 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2127 (T354336)', diff saved to https://phabricator.wikimedia.org/P55129 and previous config saved to /var/cache/conftool/dbconfig/20240122-071117-marostegui.json
07:11 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2169.codfw.wmnet with reason: host reimage
07:11 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2127.codfw.wmnet with reason: Maintenance
07:11 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2127.codfw.wmnet with reason: Maintenance
07:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2109 (T354336)', diff saved to https://phabricator.wikimedia.org/P55128 and previous config saved to /var/cache/conftool/dbconfig/20240122-071054-marostegui.json
07:02 marostegui@cumin1002: dbctl commit (dc=all): 'db1187 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55127 and previous config saved to /var/cache/conftool/dbconfig/20240122-070202-root.json
07:02 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/machinetranslation: apply
06:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P55126 and previous config saved to /var/cache/conftool/dbconfig/20240122-065548-marostegui.json
06:55 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/machinetranslation: apply
06:52 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2169.codfw.wmnet with OS bookworm
06:52 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/machinetranslation: apply
06:49 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2169:3316 db2169:3317', diff saved to https://phabricator.wikimedia.org/P55125 and previous config saved to /var/cache/conftool/dbconfig/20240122-064929-marostegui.json
06:47 kartik@deploy2002: helmfile [staging] START helmfile.d/services/machinetranslation: apply
06:46 marostegui@cumin1002: dbctl commit (dc=all): 'db1187 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P55124 and previous config saved to /var/cache/conftool/dbconfig/20240122-064657-root.json
06:46 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1187.eqiad.wmnet with OS bookworm
06:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P55123 and previous config saved to /var/cache/conftool/dbconfig/20240122-064041-marostegui.json
06:26 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1187.eqiad.wmnet with reason: host reimage
06:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2109 (T354336)', diff saved to https://phabricator.wikimedia.org/P55122 and previous config saved to /var/cache/conftool/dbconfig/20240122-062535-marostegui.json
06:23 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1187.eqiad.wmnet with reason: host reimage
06:10 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1187.eqiad.wmnet with OS bookworm
06:08 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1187 T354506', diff saved to https://phabricator.wikimedia.org/P55121 and previous config saved to /var/cache/conftool/dbconfig/20240122-060811-marostegui.json
06:05 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2109 (T354336)', diff saved to https://phabricator.wikimedia.org/P55120 and previous config saved to /var/cache/conftool/dbconfig/20240122-060529-marostegui.json
06:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2109.codfw.wmnet with reason: Maintenance
06:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2109.codfw.wmnet with reason: Maintenance
06:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1157.eqiad.wmnet with reason: Maintenance
06:00 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1157.eqiad.wmnet with reason: Maintenance
05:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2139.codfw.wmnet with reason: Maintenance
05:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2139.codfw.wmnet with reason: Maintenance
05:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P55119 and previous config saved to /var/cache/conftool/dbconfig/20240122-054005-ladsgroup.json
05:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314', diff saved to https://phabricator.wikimedia.org/P55118 and previous config saved to /var/cache/conftool/dbconfig/20240122-052458-ladsgroup.json
05:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314', diff saved to https://phabricator.wikimedia.org/P55117 and previous config saved to /var/cache/conftool/dbconfig/20240122-050952-ladsgroup.json
04:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P55116 and previous config saved to /var/cache/conftool/dbconfig/20240122-045445-ladsgroup.json

2024-01-21

23:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2138:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P55115 and previous config saved to /var/cache/conftool/dbconfig/20240121-232323-ladsgroup.json
23:23 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2138.codfw.wmnet with reason: Maintenance
23:23 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2138.codfw.wmnet with reason: Maintenance
23:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P55114 and previous config saved to /var/cache/conftool/dbconfig/20240121-232300-ladsgroup.json
23:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314', diff saved to https://phabricator.wikimedia.org/P55113 and previous config saved to /var/cache/conftool/dbconfig/20240121-230754-ladsgroup.json
22:55 tgr: T355491 Ran mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=dawiki --logwiki=metawiki 'Radiocolono' 'GuaritaRM'
22:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314', diff saved to https://phabricator.wikimedia.org/P55112 and previous config saved to /var/cache/conftool/dbconfig/20240121-225247-ladsgroup.json
22:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P55111 and previous config saved to /var/cache/conftool/dbconfig/20240121-223740-ladsgroup.json
17:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2137:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P55110 and previous config saved to /var/cache/conftool/dbconfig/20240121-171534-ladsgroup.json
17:15 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2137.codfw.wmnet with reason: Maintenance
17:15 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2137.codfw.wmnet with reason: Maintenance
17:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136 (T352010)', diff saved to https://phabricator.wikimedia.org/P55109 and previous config saved to /var/cache/conftool/dbconfig/20240121-171512-ladsgroup.json
17:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136', diff saved to https://phabricator.wikimedia.org/P55108 and previous config saved to /var/cache/conftool/dbconfig/20240121-170005-ladsgroup.json
16:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136', diff saved to https://phabricator.wikimedia.org/P55107 and previous config saved to /var/cache/conftool/dbconfig/20240121-164459-ladsgroup.json
16:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136 (T352010)', diff saved to https://phabricator.wikimedia.org/P55106 and previous config saved to /var/cache/conftool/dbconfig/20240121-162952-ladsgroup.json
11:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2136 (T352010)', diff saved to https://phabricator.wikimedia.org/P55105 and previous config saved to /var/cache/conftool/dbconfig/20240121-110344-ladsgroup.json
11:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2136.codfw.wmnet with reason: Maintenance
11:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2136.codfw.wmnet with reason: Maintenance
11:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119 (T352010)', diff saved to https://phabricator.wikimedia.org/P55104 and previous config saved to /var/cache/conftool/dbconfig/20240121-110322-ladsgroup.json
10:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119', diff saved to https://phabricator.wikimedia.org/P55103 and previous config saved to /var/cache/conftool/dbconfig/20240121-104815-ladsgroup.json
10:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119', diff saved to https://phabricator.wikimedia.org/P55102 and previous config saved to /var/cache/conftool/dbconfig/20240121-103309-ladsgroup.json
10:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119 (T352010)', diff saved to https://phabricator.wikimedia.org/P55101 and previous config saved to /var/cache/conftool/dbconfig/20240121-101802-ladsgroup.json
09:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2119 (T352010)', diff saved to https://phabricator.wikimedia.org/P55100 and previous config saved to /var/cache/conftool/dbconfig/20240121-091731-ladsgroup.json
09:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2119.codfw.wmnet with reason: Maintenance
09:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2119.codfw.wmnet with reason: Maintenance
09:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2110 (T352010)', diff saved to https://phabricator.wikimedia.org/P55099 and previous config saved to /var/cache/conftool/dbconfig/20240121-091708-ladsgroup.json
09:08 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2175', diff saved to https://phabricator.wikimedia.org/P55098 and previous config saved to /var/cache/conftool/dbconfig/20240121-090831-marostegui.json
09:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2110', diff saved to https://phabricator.wikimedia.org/P55097 and previous config saved to /var/cache/conftool/dbconfig/20240121-090202-ladsgroup.json
08:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2110', diff saved to https://phabricator.wikimedia.org/P55096 and previous config saved to /var/cache/conftool/dbconfig/20240121-084655-ladsgroup.json
08:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2110 (T352010)', diff saved to https://phabricator.wikimedia.org/P55095 and previous config saved to /var/cache/conftool/dbconfig/20240121-083148-ladsgroup.json
02:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2110 (T352010)', diff saved to https://phabricator.wikimedia.org/P55094 and previous config saved to /var/cache/conftool/dbconfig/20240121-024507-ladsgroup.json
02:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2110.codfw.wmnet with reason: Maintenance
02:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2110.codfw.wmnet with reason: Maintenance
02:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106 (T352010)', diff saved to https://phabricator.wikimedia.org/P55093 and previous config saved to /var/cache/conftool/dbconfig/20240121-024445-ladsgroup.json
02:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106', diff saved to https://phabricator.wikimedia.org/P55092 and previous config saved to /var/cache/conftool/dbconfig/20240121-022939-ladsgroup.json
02:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106', diff saved to https://phabricator.wikimedia.org/P55091 and previous config saved to /var/cache/conftool/dbconfig/20240121-021432-ladsgroup.json
01:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106 (T352010)', diff saved to https://phabricator.wikimedia.org/P55090 and previous config saved to /var/cache/conftool/dbconfig/20240121-015926-ladsgroup.json
00:29 mutante: phabricator is back and on bullseye
00:11 brennen@deploy2002: Finished deploy [phabricator/deployment@24a2a2a]: initial deploy to re-imaged phab1004 (duration: 00m 13s)
00:11 brennen@deploy2002: Started deploy [phabricator/deployment@24a2a2a]: initial deploy to re-imaged phab1004
00:03 mutante: phab1004:/usr/bin# ln -s /var/lib/scap/scap/bin/scap .
00:00 brennen@deploy2002: Installation of scap version "latest" completed for 1 hosts
00:00 brennen@deploy2002: Installing scap version "latest" for 1 hosts

2024-01-20

23:58 mutante: phab1004 - chown -R scap:scap /var/lib/scap
23:10 brennen@deploy2002: Installing scap version "latest" for 1 hosts
22:45 brennen@deploy2002: Finished deploy [phabricator/deployment@24a2a2a]: initial deploy to re-imaged phab1004 (duration: 00m 10s)
22:44 brennen@deploy2002: Started deploy [phabricator/deployment@24a2a2a]: initial deploy to re-imaged phab1004
22:39 brennen@deploy2002: Finished deploy [phabricator/deployment@24a2a2a]: initial deploy to re-imaged phab1004 (duration: 00m 10s)
22:39 brennen@deploy2002: Started deploy [phabricator/deployment@24a2a2a]: initial deploy to re-imaged phab1004
22:34 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: deployment
22:34 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on phab2002.codfw.wmnet with reason: deployment
22:28 brennen@deploy2002: Finished deploy [phabricator/deployment@24a2a2a]: deploy to phab2002 to pick up db config revert (part 2) (duration: 00m 54s)
22:27 brennen@deploy2002: Started deploy [phabricator/deployment@24a2a2a]: deploy to phab2002 to pick up db config revert (part 2)
22:23 brennen@deploy2002: Finished deploy [phabricator/deployment@24a2a2a]: deploy to phab2002 to pick up db config revert (duration: 00m 55s)
22:22 brennen@deploy2002: Started deploy [phabricator/deployment@24a2a2a]: deploy to phab2002 to pick up db config revert
22:02 dzahn@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on phabricator.wikimedia.org with reason: OS upgrade
22:02 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on phabricator.wikimedia.org with reason: OS upgrade
22:02 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on phab.wmfusercontent.org with reason: OS upgrade
22:02 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on phab.wmfusercontent.org with reason: OS upgrade
22:02 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host phab1004.eqiad.wmnet with OS bullseye
22:02 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on phab1004.eqiad.wmnet with reason: OS upgrade
22:01 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on phab1004.eqiad.wmnet with reason: OS upgrade
21:46 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on phab1004.eqiad.wmnet with reason: host reimage
21:43 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on phab1004.eqiad.wmnet with reason: host reimage
21:33 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on phab2002.codfw.wmnet with reason: deployment
21:33 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on phab2002.codfw.wmnet with reason: deployment
21:31 dzahn@cumin1002: START - Cookbook sre.hosts.reimage for host phab1004.eqiad.wmnet with OS bullseye
21:27 dzahn@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host phab1004.eqiad.wmnet with OS bullseye
21:27 dzahn@cumin1002: START - Cookbook sre.hosts.reimage for host phab1004.eqiad.wmnet with OS bullseye
21:03 brennen@deploy2002: Finished deploy [phabricator/deployment@24a2a2a]: deploy to phab2002 to pick up db config changes (redux) (duration: 01m 35s)
21:02 brennen@deploy2002: Started deploy [phabricator/deployment@24a2a2a]: deploy to phab2002 to pick up db config changes (redux)
20:38 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on phab2002.codfw.wmnet with reason: maintenance
20:38 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on phab2002.codfw.wmnet with reason: maintenance
20:37 brennen@deploy2002: Finished deploy [phabricator/deployment@24a2a2a]: deploy to phab2002 to pick up database changes (duration: 00m 53s)
20:36 brennen@deploy2002: Started deploy [phabricator/deployment@24a2a2a]: deploy to phab2002 to pick up database changes
20:32 mutante: phabricator going down for maintenance
20:24 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on phab.wmfusercontent.org with reason: OS upgrade
20:23 dzahn@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on phabricator.wikimedia.org with reason: OS upgrade
20:23 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on phabricator.wikimedia.org with reason: OS upgrade
20:22 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on phab1004.eqiad.wmnet with reason: OS upgrade
20:22 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on phab1004.eqiad.wmnet with reason: OS upgrade
20:04 brennen: start of phab/phorge bullseye update window - T334519
20:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2106 (T352010)', diff saved to https://phabricator.wikimedia.org/P55089 and previous config saved to /var/cache/conftool/dbconfig/20240120-200154-ladsgroup.json
20:01 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2106.codfw.wmnet with reason: Maintenance
20:01 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2106.codfw.wmnet with reason: Maintenance
14:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2099.codfw.wmnet with reason: Maintenance
14:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2099.codfw.wmnet with reason: Maintenance
09:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
09:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
09:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1249 (T352010)', diff saved to https://phabricator.wikimedia.org/P55087 and previous config saved to /var/cache/conftool/dbconfig/20240120-095311-ladsgroup.json
09:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P55086 and previous config saved to /var/cache/conftool/dbconfig/20240120-093804-ladsgroup.json
09:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P55085 and previous config saved to /var/cache/conftool/dbconfig/20240120-092257-ladsgroup.json
09:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1249 (T352010)', diff saved to https://phabricator.wikimedia.org/P55084 and previous config saved to /var/cache/conftool/dbconfig/20240120-090751-ladsgroup.json
04:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1249 (T352010)', diff saved to https://phabricator.wikimedia.org/P55083 and previous config saved to /var/cache/conftool/dbconfig/20240120-041124-ladsgroup.json
04:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1249.eqiad.wmnet with reason: Maintenance
04:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1249.eqiad.wmnet with reason: Maintenance
04:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1248 (T352010)', diff saved to https://phabricator.wikimedia.org/P55082 and previous config saved to /var/cache/conftool/dbconfig/20240120-041102-ladsgroup.json
03:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P55081 and previous config saved to /var/cache/conftool/dbconfig/20240120-035555-ladsgroup.json
03:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P55080 and previous config saved to /var/cache/conftool/dbconfig/20240120-034049-ladsgroup.json
03:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1248 (T352010)', diff saved to https://phabricator.wikimedia.org/P55079 and previous config saved to /var/cache/conftool/dbconfig/20240120-032542-ladsgroup.json

2024-01-19

22:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1248 (T352010)', diff saved to https://phabricator.wikimedia.org/P55078 and previous config saved to /var/cache/conftool/dbconfig/20240119-225906-ladsgroup.json
22:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1248.eqiad.wmnet with reason: Maintenance
22:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1248.eqiad.wmnet with reason: Maintenance
22:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1247 (T352010)', diff saved to https://phabricator.wikimedia.org/P55077 and previous config saved to /var/cache/conftool/dbconfig/20240119-225844-ladsgroup.json
22:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P55076 and previous config saved to /var/cache/conftool/dbconfig/20240119-224337-ladsgroup.json
22:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P55075 and previous config saved to /var/cache/conftool/dbconfig/20240119-222830-ladsgroup.json
22:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1247 (T352010)', diff saved to https://phabricator.wikimedia.org/P55074 and previous config saved to /var/cache/conftool/dbconfig/20240119-221324-ladsgroup.json
22:05 ryankemper: [WDQS] Repooled `wdqs10[19,20]` (caught up on lag)
20:21 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
20:21 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
20:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1223 (T354336)', diff saved to https://phabricator.wikimedia.org/P55073 and previous config saved to /var/cache/conftool/dbconfig/20240119-202129-marostegui.json
20:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1223', diff saved to https://phabricator.wikimedia.org/P55072 and previous config saved to /var/cache/conftool/dbconfig/20240119-200622-marostegui.json
19:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1223', diff saved to https://phabricator.wikimedia.org/P55071 and previous config saved to /var/cache/conftool/dbconfig/20240119-195116-marostegui.json
19:45 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2088.codfw.wmnet with OS bullseye
19:43 bking@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['elastic2088.codfw.wmnet']
19:38 bking@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['elastic2088.codfw.wmnet']
19:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1223 (T354336)', diff saved to https://phabricator.wikimedia.org/P55070 and previous config saved to /var/cache/conftool/dbconfig/20240119-193610-marostegui.json
19:31 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1223 (T354336)', diff saved to https://phabricator.wikimedia.org/P55069 and previous config saved to /var/cache/conftool/dbconfig/20240119-193028-marostegui.json
19:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1223.eqiad.wmnet with reason: Maintenance
19:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1223.eqiad.wmnet with reason: Maintenance
19:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T354336)', diff saved to https://phabricator.wikimedia.org/P55068 and previous config saved to /var/cache/conftool/dbconfig/20240119-193006-marostegui.json
19:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P55067 and previous config saved to /var/cache/conftool/dbconfig/20240119-191459-marostegui.json
18:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P55066 and previous config saved to /var/cache/conftool/dbconfig/20240119-185953-marostegui.json
18:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T354336)', diff saved to https://phabricator.wikimedia.org/P55065 and previous config saved to /var/cache/conftool/dbconfig/20240119-184446-marostegui.json
18:39 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1212 (T354336)', diff saved to https://phabricator.wikimedia.org/P55064 and previous config saved to /var/cache/conftool/dbconfig/20240119-183902-marostegui.json
18:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
18:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
18:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1212.eqiad.wmnet with reason: Maintenance
18:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1212.eqiad.wmnet with reason: Maintenance
18:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T354336)', diff saved to https://phabricator.wikimedia.org/P55063 and previous config saved to /var/cache/conftool/dbconfig/20240119-183821-marostegui.json
18:27 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2088.codfw.wmnet with OS bullseye
18:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P55062 and previous config saved to /var/cache/conftool/dbconfig/20240119-182314-marostegui.json
18:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P55061 and previous config saved to /var/cache/conftool/dbconfig/20240119-180808-marostegui.json
18:02 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2094.codfw.wmnet with OS bullseye
17:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T354336)', diff saved to https://phabricator.wikimedia.org/P55060 and previous config saved to /var/cache/conftool/dbconfig/20240119-175301-marostegui.json
17:47 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1198 (T354336)', diff saved to https://phabricator.wikimedia.org/P55059 and previous config saved to /var/cache/conftool/dbconfig/20240119-174735-marostegui.json
17:47 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1198.eqiad.wmnet with reason: Maintenance
17:47 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1198.eqiad.wmnet with reason: Maintenance
17:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T354336)', diff saved to https://phabricator.wikimedia.org/P55058 and previous config saved to /var/cache/conftool/dbconfig/20240119-174713-marostegui.json
17:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P55057 and previous config saved to /var/cache/conftool/dbconfig/20240119-173207-marostegui.json
17:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1247 (T352010)', diff saved to https://phabricator.wikimedia.org/P55056 and previous config saved to /var/cache/conftool/dbconfig/20240119-172715-ladsgroup.json
17:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1247.eqiad.wmnet with reason: Maintenance
17:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1247.eqiad.wmnet with reason: Maintenance
17:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1243 (T352010)', diff saved to https://phabricator.wikimedia.org/P55055 and previous config saved to /var/cache/conftool/dbconfig/20240119-172652-ladsgroup.json
17:25 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on cloudelastic1010.wikimedia.org with reason: need to fix regex certs
17:25 bking@cumin2002: START - Cookbook sre.hosts.downtime for 4:00:00 on cloudelastic1010.wikimedia.org with reason: need to fix regex certs
17:23 bking@cumin2002: conftool action : set/pooled=yes; selector: name=cloudelastic1010.wikimedia.org
17:23 bking@cumin2002: conftool action : set/pooled=yes; selector: name=cloudelastic1009.wikimedia.org
17:23 bking@cumin2002: conftool action : set/pooled=yes; selector: name=cloudelastic1008.wikimedia.org
17:22 bking@cumin2002: conftool action : set/pooled=yes; selector: name=cloudelastic1007.wikimedia.org
17:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P55054 and previous config saved to /var/cache/conftool/dbconfig/20240119-171700-marostegui.json
17:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P55053 and previous config saved to /var/cache/conftool/dbconfig/20240119-171146-ladsgroup.json
17:06 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2088.codfw.wmnet with OS bullseye
17:04 bking@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host elastic2088.codfw.wmnet with OS bullseye
17:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T354336)', diff saved to https://phabricator.wikimedia.org/P55052 and previous config saved to /var/cache/conftool/dbconfig/20240119-170154-marostegui.json
16:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P55051 and previous config saved to /var/cache/conftool/dbconfig/20240119-165639-ladsgroup.json
16:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1189 (T354336)', diff saved to https://phabricator.wikimedia.org/P55050 and previous config saved to /var/cache/conftool/dbconfig/20240119-165627-marostegui.json
16:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1189.eqiad.wmnet with reason: Maintenance
16:56 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1189.eqiad.wmnet with reason: Maintenance
16:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T354336)', diff saved to https://phabricator.wikimedia.org/P55049 and previous config saved to /var/cache/conftool/dbconfig/20240119-165605-marostegui.json
16:41 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2094.codfw.wmnet with OS bullseye
16:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1243 (T352010)', diff saved to https://phabricator.wikimedia.org/P55048 and previous config saved to /var/cache/conftool/dbconfig/20240119-164133-ladsgroup.json
16:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P55047 and previous config saved to /var/cache/conftool/dbconfig/20240119-164058-marostegui.json
16:38 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2088.codfw.wmnet with OS bullseye
16:31 Emperor: mark new drive as non-RAID, mount, restore to service with puppet ms-be2072 T355330
16:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P55046 and previous config saved to /var/cache/conftool/dbconfig/20240119-162552-marostegui.json
16:16 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2094.codfw.wmnet with OS bullseye
16:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T354336)', diff saved to https://phabricator.wikimedia.org/P55045 and previous config saved to /var/cache/conftool/dbconfig/20240119-161046-marostegui.json
16:05 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1175 (T354336)', diff saved to https://phabricator.wikimedia.org/P55044 and previous config saved to /var/cache/conftool/dbconfig/20240119-160521-marostegui.json
16:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1175.eqiad.wmnet with reason: Maintenance
16:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1175.eqiad.wmnet with reason: Maintenance
16:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T354336)', diff saved to https://phabricator.wikimedia.org/P55043 and previous config saved to /var/cache/conftool/dbconfig/20240119-160459-marostegui.json
15:57 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2088.codfw.wmnet with OS bullseye
15:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P55042 and previous config saved to /var/cache/conftool/dbconfig/20240119-154953-marostegui.json
15:46 gmodena@deploy2002: Finished deploy [airflow-dags/analytics@f32c06e]: (no justification provided) (duration: 00m 30s)
15:46 gmodena@deploy2002: Started deploy [airflow-dags/analytics@f32c06e]: (no justification provided)
15:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P55041 and previous config saved to /var/cache/conftool/dbconfig/20240119-153446-marostegui.json
15:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T354336)', diff saved to https://phabricator.wikimedia.org/P55040 and previous config saved to /var/cache/conftool/dbconfig/20240119-151940-marostegui.json
15:14 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1166 (T354336)', diff saved to https://phabricator.wikimedia.org/P55039 and previous config saved to /var/cache/conftool/dbconfig/20240119-151413-marostegui.json
15:14 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1166.eqiad.wmnet with reason: Maintenance
15:13 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1166.eqiad.wmnet with reason: Maintenance
15:09 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1150.eqiad.wmnet with reason: Maintenance
15:09 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1150.eqiad.wmnet with reason: Maintenance
15:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1140.eqiad.wmnet with reason: Maintenance
15:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1140.eqiad.wmnet with reason: Maintenance
15:01 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2118.codfw.wmnet with reason: Maintenance
15:01 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2118.codfw.wmnet with reason: Maintenance
14:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T354336)', diff saved to https://phabricator.wikimedia.org/P55038 and previous config saved to /var/cache/conftool/dbconfig/20240119-145930-marostegui.json
14:56 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2094.codfw.wmnet with OS bullseye
14:50 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic1107.eqiad.wmnet with OS bullseye
14:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P55036 and previous config saved to /var/cache/conftool/dbconfig/20240119-144423-marostegui.json
14:37 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic1103.eqiad.wmnet with OS bullseye
14:35 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2088.codfw.wmnet with OS bullseye
14:34 bking@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['elastic2088.codfw.wmnet']
14:34 bking@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['elastic2088.codfw.wmnet']
14:34 bking@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['elastic2088.codfw.wmnet']
14:33 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic1107.eqiad.wmnet with reason: host reimage
14:31 bking@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['elastic2088.codfw.wmnet']
14:29 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1107.eqiad.wmnet with reason: host reimage
14:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P55034 and previous config saved to /var/cache/conftool/dbconfig/20240119-142917-marostegui.json
14:27 gmodena@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
14:27 gmodena@deploy2002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
14:24 ejegg: payments-wiki upgraded from c37ddae5 to c2138768
14:21 gmodena@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
14:21 gmodena@deploy2002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
14:20 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic1103.eqiad.wmnet with reason: host reimage
14:17 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1103.eqiad.wmnet with reason: host reimage
14:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T354336)', diff saved to https://phabricator.wikimedia.org/P55033 and previous config saved to /var/cache/conftool/dbconfig/20240119-141411-marostegui.json
14:13 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic1107.eqiad.wmnet with OS bullseye
14:12 gmodena@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
14:12 gmodena@deploy2002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
14:08 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2182 (T354336)', diff saved to https://phabricator.wikimedia.org/P55032 and previous config saved to /var/cache/conftool/dbconfig/20240119-140746-marostegui.json
14:08 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2182.codfw.wmnet with reason: Maintenance
14:07 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2182.codfw.wmnet with reason: Maintenance
14:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317 (T354336)', diff saved to https://phabricator.wikimedia.org/P55031 and previous config saved to /var/cache/conftool/dbconfig/20240119-140712-marostegui.json
14:07 gmodena@deploy2002: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply
14:06 gmodena@deploy2002: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
14:02 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic1103.eqiad.wmnet with OS bullseye
13:58 gmodena@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
13:57 gmodena@deploy2002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
13:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317', diff saved to https://phabricator.wikimedia.org/P55030 and previous config saved to /var/cache/conftool/dbconfig/20240119-135206-marostegui.json
13:46 gmodena@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
13:46 gmodena@deploy2002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
13:43 gmodena@deploy2002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
13:38 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2046.codfw.wmnet
13:38 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1046.eqiad.wmnet
13:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317', diff saved to https://phabricator.wikimedia.org/P55029 and previous config saved to /var/cache/conftool/dbconfig/20240119-133659-marostegui.json
13:32 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2046.codfw.wmnet
13:32 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1046.eqiad.wmnet
13:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317 (T354336)', diff saved to https://phabricator.wikimedia.org/P55028 and previous config saved to /var/cache/conftool/dbconfig/20240119-132153-marostegui.json
13:19 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2169:3317 (T354336)', diff saved to https://phabricator.wikimedia.org/P55027 and previous config saved to /var/cache/conftool/dbconfig/20240119-131929-marostegui.json
13:19 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2169.codfw.wmnet with reason: Maintenance
13:19 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2169.codfw.wmnet with reason: Maintenance
13:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317 (T354336)', diff saved to https://phabricator.wikimedia.org/P55026 and previous config saved to /var/cache/conftool/dbconfig/20240119-131906-marostegui.json
13:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317', diff saved to https://phabricator.wikimedia.org/P55024 and previous config saved to /var/cache/conftool/dbconfig/20240119-130400-marostegui.json
12:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317', diff saved to https://phabricator.wikimedia.org/P55023 and previous config saved to /var/cache/conftool/dbconfig/20240119-124853-marostegui.json
12:45 jelto@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
12:44 jelto@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
12:44 jelto@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
12:43 jelto@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
12:42 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
12:41 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
12:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317 (T354336)', diff saved to https://phabricator.wikimedia.org/P55022 and previous config saved to /var/cache/conftool/dbconfig/20240119-123347-marostegui.json
12:32 gmodena@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
12:32 gmodena@deploy2002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
12:30 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2168:3317 (T354336)', diff saved to https://phabricator.wikimedia.org/P55021 and previous config saved to /var/cache/conftool/dbconfig/20240119-123023-marostegui.json
12:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2168.codfw.wmnet with reason: Maintenance
12:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2168.codfw.wmnet with reason: Maintenance
12:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T354336)', diff saved to https://phabricator.wikimedia.org/P55020 and previous config saved to /var/cache/conftool/dbconfig/20240119-123001-marostegui.json
12:30 gmodena@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
12:29 gmodena@deploy2002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
12:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P55019 and previous config saved to /var/cache/conftool/dbconfig/20240119-121455-marostegui.json
11:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P55018 and previous config saved to /var/cache/conftool/dbconfig/20240119-115948-marostegui.json
11:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1243 (T352010)', diff saved to https://phabricator.wikimedia.org/P55017 and previous config saved to /var/cache/conftool/dbconfig/20240119-114452-ladsgroup.json
11:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T354336)', diff saved to https://phabricator.wikimedia.org/P55016 and previous config saved to /var/cache/conftool/dbconfig/20240119-114442-marostegui.json
11:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1243.eqiad.wmnet with reason: Maintenance
11:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1243.eqiad.wmnet with reason: Maintenance
11:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1242 (T352010)', diff saved to https://phabricator.wikimedia.org/P55015 and previous config saved to /var/cache/conftool/dbconfig/20240119-114424-ladsgroup.json
11:42 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2159 (T354336)', diff saved to https://phabricator.wikimedia.org/P55014 and previous config saved to /var/cache/conftool/dbconfig/20240119-114219-marostegui.json
11:42 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db2187.codfw.wmnet with reason: Maintenance
11:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on db2187.codfw.wmnet with reason: Maintenance
11:41 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2159.codfw.wmnet with reason: Maintenance
11:41 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2159.codfw.wmnet with reason: Maintenance
11:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T354336)', diff saved to https://phabricator.wikimedia.org/P55013 and previous config saved to /var/cache/conftool/dbconfig/20240119-114140-marostegui.json
11:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P55012 and previous config saved to /var/cache/conftool/dbconfig/20240119-112917-ladsgroup.json
11:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P55011 and previous config saved to /var/cache/conftool/dbconfig/20240119-112634-marostegui.json
11:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P55010 and previous config saved to /var/cache/conftool/dbconfig/20240119-111411-ladsgroup.json
11:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P55009 and previous config saved to /var/cache/conftool/dbconfig/20240119-111127-marostegui.json
10:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1242 (T352010)', diff saved to https://phabricator.wikimedia.org/P55008 and previous config saved to /var/cache/conftool/dbconfig/20240119-105904-ladsgroup.json
10:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T354336)', diff saved to https://phabricator.wikimedia.org/P55007 and previous config saved to /var/cache/conftool/dbconfig/20240119-105621-marostegui.json
10:45 cmooney@cumin1002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin[1001-1002].eqiad.wmnet with reason: Release v0.6.5 - cmooney@cumin1002
10:42 cmooney@cumin1002: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin[1001-1002].eqiad.wmnet with reason: Release v0.6.5 - cmooney@cumin1002
10:13 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2150 (T354336)', diff saved to https://phabricator.wikimedia.org/P55006 and previous config saved to /var/cache/conftool/dbconfig/20240119-101340-marostegui.json
10:13 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2150.codfw.wmnet with reason: Maintenance
10:13 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2150.codfw.wmnet with reason: Maintenance
10:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T354336)', diff saved to https://phabricator.wikimedia.org/P55005 and previous config saved to /var/cache/conftool/dbconfig/20240119-101318-marostegui.json
09:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P55004 and previous config saved to /var/cache/conftool/dbconfig/20240119-095811-marostegui.json
09:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P55003 and previous config saved to /var/cache/conftool/dbconfig/20240119-094305-marostegui.json
09:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T354336)', diff saved to https://phabricator.wikimedia.org/P55002 and previous config saved to /var/cache/conftool/dbconfig/20240119-092758-marostegui.json
09:25 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2122 (T354336)', diff saved to https://phabricator.wikimedia.org/P55001 and previous config saved to /var/cache/conftool/dbconfig/20240119-092535-marostegui.json
09:25 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2122.codfw.wmnet with reason: Maintenance
09:25 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2122.codfw.wmnet with reason: Maintenance
09:25 jnuche@deploy2002: Installation of scap version "4.65.2" completed for 531 hosts
09:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121 (T354336)', diff saved to https://phabricator.wikimedia.org/P55000 and previous config saved to /var/cache/conftool/dbconfig/20240119-092513-marostegui.json
09:24 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host sessionstore2006.codfw.wmnet
09:24 jnuche@deploy2002: Installing scap version "4.65.2" for 531 hosts
09:15 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host sessionstore2006.codfw.wmnet
09:14 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host sessionstore2005.codfw.wmnet
09:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P54999 and previous config saved to /var/cache/conftool/dbconfig/20240119-091007-marostegui.json
09:03 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host sessionstore2005.codfw.wmnet
09:03 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host sessionstore2004.codfw.wmnet
08:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P54998 and previous config saved to /var/cache/conftool/dbconfig/20240119-085500-marostegui.json
08:53 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host sessionstore2004.codfw.wmnet
08:50 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host sessionstore1006.eqiad.wmnet
08:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121 (T354336)', diff saved to https://phabricator.wikimedia.org/P54997 and previous config saved to /var/cache/conftool/dbconfig/20240119-083954-marostegui.json
08:39 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host sessionstore1006.eqiad.wmnet
08:37 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2121 (T354336)', diff saved to https://phabricator.wikimedia.org/P54996 and previous config saved to /var/cache/conftool/dbconfig/20240119-083730-marostegui.json
08:37 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2121.codfw.wmnet with reason: Maintenance
08:37 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2121.codfw.wmnet with reason: Maintenance
08:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2120 (T354336)', diff saved to https://phabricator.wikimedia.org/P54995 and previous config saved to /var/cache/conftool/dbconfig/20240119-083709-marostegui.json
08:32 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host sessionstore1005.eqiad.wmnet
08:22 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host sessionstore1005.eqiad.wmnet
08:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2120', diff saved to https://phabricator.wikimedia.org/P54994 and previous config saved to /var/cache/conftool/dbconfig/20240119-082202-marostegui.json
08:20 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host sessionstore1004.eqiad.wmnet
08:11 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host sessionstore1004.eqiad.wmnet
08:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2120', diff saved to https://phabricator.wikimedia.org/P54993 and previous config saved to /var/cache/conftool/dbconfig/20240119-080655-marostegui.json
07:58 marostegui@cumin1002: dbctl commit (dc=all): 'db1227 (re)pooling @ 100%: T354336', diff saved to https://phabricator.wikimedia.org/P54992 and previous config saved to /var/cache/conftool/dbconfig/20240119-075828-root.json
07:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2120 (T354336)', diff saved to https://phabricator.wikimedia.org/P54991 and previous config saved to /var/cache/conftool/dbconfig/20240119-075149-marostegui.json
07:48 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2120 (T354336)', diff saved to https://phabricator.wikimedia.org/P54990 and previous config saved to /var/cache/conftool/dbconfig/20240119-074825-marostegui.json
07:48 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2120.codfw.wmnet with reason: Maintenance
07:47 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2120.codfw.wmnet with reason: Maintenance
07:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2108 (T354336)', diff saved to https://phabricator.wikimedia.org/P54989 and previous config saved to /var/cache/conftool/dbconfig/20240119-074752-marostegui.json
07:43 marostegui@cumin1002: dbctl commit (dc=all): 'db1227 (re)pooling @ 75%: T354336', diff saved to https://phabricator.wikimedia.org/P54988 and previous config saved to /var/cache/conftool/dbconfig/20240119-074323-root.json
07:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2108', diff saved to https://phabricator.wikimedia.org/P54987 and previous config saved to /var/cache/conftool/dbconfig/20240119-073245-marostegui.json
07:28 marostegui@cumin1002: dbctl commit (dc=all): 'db1227 (re)pooling @ 50%: T354336', diff saved to https://phabricator.wikimedia.org/P54986 and previous config saved to /var/cache/conftool/dbconfig/20240119-072818-root.json
07:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2108', diff saved to https://phabricator.wikimedia.org/P54985 and previous config saved to /var/cache/conftool/dbconfig/20240119-071739-marostegui.json
07:13 marostegui@cumin1002: dbctl commit (dc=all): 'db1227 (re)pooling @ 25%: T354336', diff saved to https://phabricator.wikimedia.org/P54984 and previous config saved to /var/cache/conftool/dbconfig/20240119-071313-root.json
07:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2108 (T354336)', diff saved to https://phabricator.wikimedia.org/P54983 and previous config saved to /var/cache/conftool/dbconfig/20240119-070233-marostegui.json
07:00 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2108 (T354336)', diff saved to https://phabricator.wikimedia.org/P54982 and previous config saved to /var/cache/conftool/dbconfig/20240119-070009-marostegui.json
07:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2108.codfw.wmnet with reason: Maintenance
06:59 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2108.codfw.wmnet with reason: Maintenance
06:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2100.codfw.wmnet with reason: Maintenance
06:59 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2100.codfw.wmnet with reason: Maintenance
06:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2098.codfw.wmnet with reason: Maintenance
06:58 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2098.codfw.wmnet with reason: Maintenance
06:58 marostegui@cumin1002: dbctl commit (dc=all): 'db1227 (re)pooling @ 10%: T354336', diff saved to https://phabricator.wikimedia.org/P54981 and previous config saved to /var/cache/conftool/dbconfig/20240119-065808-root.json
06:57 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1181.eqiad.wmnet with reason: Maintenance
06:57 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1181.eqiad.wmnet with reason: Maintenance
06:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1227.eqiad.wmnet with reason: Maintenance
06:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1227.eqiad.wmnet with reason: Maintenance
06:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1227.eqiad.wmnet with reason: Maintenance
06:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1227.eqiad.wmnet with reason: Maintenance
06:30 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1227 (T354336)', diff saved to https://phabricator.wikimedia.org/P54979 and previous config saved to /var/cache/conftool/dbconfig/20240119-063020-marostegui.json
06:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1227.eqiad.wmnet with reason: Maintenance
06:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1227.eqiad.wmnet with reason: Maintenance
06:28 marostegui@cumin1002: END (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 8:00:00 on db1158.eqiad.wmnet with reason: Maintenance
06:28 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1158.eqiad.wmnet with reason: Maintenance
06:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1242 (T352010)', diff saved to https://phabricator.wikimedia.org/P54978 and previous config saved to /var/cache/conftool/dbconfig/20240119-061827-ladsgroup.json
06:18 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1242.eqiad.wmnet with reason: Maintenance
06:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1242.eqiad.wmnet with reason: Maintenance
06:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1241 (T352010)', diff saved to https://phabricator.wikimedia.org/P54977 and previous config saved to /var/cache/conftool/dbconfig/20240119-061805-ladsgroup.json
06:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P54976 and previous config saved to /var/cache/conftool/dbconfig/20240119-060258-ladsgroup.json
05:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P54975 and previous config saved to /var/cache/conftool/dbconfig/20240119-054751-ladsgroup.json
05:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1241 (T352010)', diff saved to https://phabricator.wikimedia.org/P54974 and previous config saved to /var/cache/conftool/dbconfig/20240119-053244-ladsgroup.json
03:38 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2094.codfw.wmnet with OS bullseye
02:49 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic1103.eqiad.wmnet with OS bullseye
02:48 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic1106.eqiad.wmnet with OS bullseye
02:45 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic1105.eqiad.wmnet with OS bullseye
02:41 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic1104.eqiad.wmnet with OS bullseye
02:31 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic1106.eqiad.wmnet with reason: host reimage
02:28 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1106.eqiad.wmnet with reason: host reimage
02:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic1105.eqiad.wmnet with reason: host reimage
02:24 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1105.eqiad.wmnet with reason: host reimage
02:24 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic1104.eqiad.wmnet with reason: host reimage
02:21 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1104.eqiad.wmnet with reason: host reimage
02:18 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2094.codfw.wmnet with OS bullseye
02:17 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2088.codfw.wmnet with OS bullseye
02:12 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic1106.eqiad.wmnet with OS bullseye
02:09 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic1105.eqiad.wmnet with OS bullseye
02:09 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2094.codfw.wmnet with OS bullseye
02:06 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic1104.eqiad.wmnet with OS bullseye
02:01 tzatziki: removing 4 files for legal compliance
01:42 tzatziki: removing 3 files for legal compliance
01:28 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic1103.eqiad.wmnet with OS bullseye
01:08 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2097.codfw.wmnet with OS bullseye
01:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2096.codfw.wmnet with OS bullseye
00:57 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2088.codfw.wmnet with OS bullseye
00:50 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2097.codfw.wmnet with reason: host reimage
00:50 tzatziki: removing 1 file for legal compliance
00:49 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2094.codfw.wmnet with OS bullseye
00:47 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2097.codfw.wmnet with reason: host reimage
00:46 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2096.codfw.wmnet with reason: host reimage
00:43 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2096.codfw.wmnet with reason: host reimage
00:42 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2101.codfw.wmnet with OS bullseye
00:40 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2100.codfw.wmnet with OS bullseye
00:34 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2099.codfw.wmnet with OS bullseye
00:30 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2097.codfw.wmnet with OS bullseye
00:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1241 (T352010)', diff saved to https://phabricator.wikimedia.org/P54973 and previous config saved to /var/cache/conftool/dbconfig/20240119-002755-ladsgroup.json
00:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1241.eqiad.wmnet with reason: Maintenance
00:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1241.eqiad.wmnet with reason: Maintenance
00:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1238 (T352010)', diff saved to https://phabricator.wikimedia.org/P54972 and previous config saved to /var/cache/conftool/dbconfig/20240119-002733-ladsgroup.json
00:26 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2096.codfw.wmnet with OS bullseye
00:26 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2098.codfw.wmnet with OS bullseye
00:25 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2101.codfw.wmnet with reason: host reimage
00:22 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2100.codfw.wmnet with reason: host reimage
00:21 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2101.codfw.wmnet with reason: host reimage
00:18 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2100.codfw.wmnet with reason: host reimage
00:17 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2099.codfw.wmnet with reason: host reimage
00:14 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2099.codfw.wmnet with reason: host reimage
00:13 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on wdqs1020.eqiad.wmnet with reason: needs to catch up from its lag
00:13 bking@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on wdqs1020.eqiad.wmnet with reason: needs to catch up from its lag
00:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P54971 and previous config saved to /var/cache/conftool/dbconfig/20240119-001226-ladsgroup.json
00:12 inflatador: bking@wdqs1020 depool host to catch up on lag
00:08 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2098.codfw.wmnet with reason: host reimage
00:05 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2098.codfw.wmnet with reason: host reimage
00:05 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2101.codfw.wmnet with OS bullseye
00:02 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2100.codfw.wmnet with OS bullseye

2024-01-18

23:57 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2099.codfw.wmnet with OS bullseye
23:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P54970 and previous config saved to /var/cache/conftool/dbconfig/20240118-235720-ladsgroup.json
23:50 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2094.codfw.wmnet with OS bullseye
23:49 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2098.codfw.wmnet with OS bullseye
23:47 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2088.codfw.wmnet with OS bullseye
23:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1238 (T352010)', diff saved to https://phabricator.wikimedia.org/P54969 and previous config saved to /var/cache/conftool/dbconfig/20240118-234213-ladsgroup.json
23:13 tstarling@deploy2002: Synchronized php-1.42.0-wmf.14/extensions/CodeMirror/resources/mode/mediawiki/mediawiki.less: fix CodeMirror style bug T355290 (duration: 06m 33s)
22:59 bking@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host elastic2086.codfw.wmnet
22:55 bking@cumin2002: START - Cookbook sre.puppet.migrate-host for host elastic2086.codfw.wmnet
22:55 bking@cumin2002: END (FAIL) - Cookbook sre.puppet.migrate-host (exit_code=99) for host elastic2086*
22:54 bking@cumin2002: START - Cookbook sre.puppet.migrate-host for host elastic2086*
22:30 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2094.codfw.wmnet with OS bullseye
22:00 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2088.codfw.wmnet with OS bullseye
21:59 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2088.codfw.wmnet with OS bullseye
21:57 urbanecm@deploy2002: Finished scap: Backport for Use BetaFeatures::isFeatureEnabled instead of getOption (T354288) (duration: 06m 58s)
21:50 urbanecm@deploy2002: Started scap: Backport for Use BetaFeatures::isFeatureEnabled instead of getOption (T354288)
21:41 jforrester@deploy2002: Finished scap: Backport for Promote wikimaniawiki to Vector 2022 as default skin (T355297) (duration: 07m 33s)
21:35 jforrester@deploy2002: jforrester and msz2001: Continuing with sync
21:35 jforrester@deploy2002: jforrester and msz2001: Backport for Promote wikimaniawiki to Vector 2022 as default skin (T355297) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
21:34 jforrester@deploy2002: Started scap: Backport for Promote wikimaniawiki to Vector 2022 as default skin (T355297)
21:15 Dreamy_Jazz: T351400 running on a tmux session `mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --use-jobqueue --sleep 30 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-30-no-render-now.txt`
21:14 dreamyjazz@deploy2002: Finished scap: Backport for Log to statsd HTTP status codes and reduce logstash log levels (T355216) (duration: 09m 00s)
21:14 Dreamy_Jazz: Stopped MediaModeration scanning script (T351400)
21:13 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
21:13 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
21:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1236 (T354336)', diff saved to https://phabricator.wikimedia.org/P54968 and previous config saved to /var/cache/conftool/dbconfig/20240118-211337-marostegui.json
21:08 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
21:08 dreamyjazz@deploy2002: dreamyjazz: Backport for Log to statsd HTTP status codes and reduce logstash log levels (T355216) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
21:05 dreamyjazz@deploy2002: Started scap: Backport for Log to statsd HTTP status codes and reduce logstash log levels (T355216)
21:04 ejegg: payments-wiki upgraded from e38b24f0 to c37ddae5
20:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1236', diff saved to https://phabricator.wikimedia.org/P54967 and previous config saved to /var/cache/conftool/dbconfig/20240118-205830-marostegui.json
20:44 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2094.codfw.wmnet with OS bullseye
20:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1236', diff saved to https://phabricator.wikimedia.org/P54966 and previous config saved to /var/cache/conftool/dbconfig/20240118-204324-marostegui.json
20:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1236 (T354336)', diff saved to https://phabricator.wikimedia.org/P54965 and previous config saved to /var/cache/conftool/dbconfig/20240118-202817-marostegui.json
20:26 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1236 (T354336)', diff saved to https://phabricator.wikimedia.org/P54964 and previous config saved to /var/cache/conftool/dbconfig/20240118-202606-marostegui.json
20:26 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1236.eqiad.wmnet with reason: Maintenance
20:25 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1236.eqiad.wmnet with reason: Maintenance
20:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T354336)', diff saved to https://phabricator.wikimedia.org/P54963 and previous config saved to /var/cache/conftool/dbconfig/20240118-202544-marostegui.json
20:24 mutante: rsyncing phab repo data, gitlab2003 pulls from phab2002 (inactive server) - test only to see how long it will take, can be stopped
20:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P54962 and previous config saved to /var/cache/conftool/dbconfig/20240118-201037-marostegui.json
20:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2095.codfw.wmnet with OS bullseye
19:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P54961 and previous config saved to /var/cache/conftool/dbconfig/20240118-195531-marostegui.json
19:48 ryankemper: T354662 Running `sudo -i authdns-update` on `dns1004` following merge of https://gerrit.wikimedia.org/r/c/operations/dns/+/991429
19:46 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2095.codfw.wmnet with reason: host reimage
19:43 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2095.codfw.wmnet with reason: host reimage
19:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T354336)', diff saved to https://phabricator.wikimedia.org/P54960 and previous config saved to /var/cache/conftool/dbconfig/20240118-194024-marostegui.json
19:26 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2095.codfw.wmnet with OS bullseye
19:24 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2093.codfw.wmnet with OS bullseye
19:23 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2094.codfw.wmnet with OS bullseye
19:19 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2092.codfw.wmnet with OS bullseye
19:11 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2091.codfw.wmnet with OS bullseye
19:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2093.codfw.wmnet with reason: host reimage
19:06 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2089.codfw.wmnet with OS bullseye
19:04 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2093.codfw.wmnet with reason: host reimage
19:02 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2092.codfw.wmnet with reason: host reimage
18:59 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2092.codfw.wmnet with reason: host reimage
18:54 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2091.codfw.wmnet with reason: host reimage
18:51 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2091.codfw.wmnet with reason: host reimage
18:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1238 (T352010)', diff saved to https://phabricator.wikimedia.org/P54959 and previous config saved to /var/cache/conftool/dbconfig/20240118-185038-ladsgroup.json
18:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1238.eqiad.wmnet with reason: Maintenance
18:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1238.eqiad.wmnet with reason: Maintenance
18:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1221 (T352010)', diff saved to https://phabricator.wikimedia.org/P54958 and previous config saved to /var/cache/conftool/dbconfig/20240118-185016-ladsgroup.json
18:48 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2089.codfw.wmnet with reason: host reimage
18:47 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2093.codfw.wmnet with OS bullseye
18:45 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2089.codfw.wmnet with reason: host reimage
18:42 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2092.codfw.wmnet with OS bullseye
18:40 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1227 (T354336)', diff saved to https://phabricator.wikimedia.org/P54957 and previous config saved to /var/cache/conftool/dbconfig/20240118-184002-marostegui.json
18:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1227.eqiad.wmnet with reason: Maintenance
18:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1227.eqiad.wmnet with reason: Maintenance
18:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T354336)', diff saved to https://phabricator.wikimedia.org/P54956 and previous config saved to /var/cache/conftool/dbconfig/20240118-183940-marostegui.json
18:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P54955 and previous config saved to /var/cache/conftool/dbconfig/20240118-183510-ladsgroup.json
18:34 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2091.codfw.wmnet with OS bullseye
18:28 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2089.codfw.wmnet with OS bullseye
18:25 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2088.codfw.wmnet with OS bullseye
18:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P54954 and previous config saved to /var/cache/conftool/dbconfig/20240118-182433-marostegui.json
18:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P54953 and previous config saved to /var/cache/conftool/dbconfig/20240118-182003-ladsgroup.json
18:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P54951 and previous config saved to /var/cache/conftool/dbconfig/20240118-180927-marostegui.json
18:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1221 (T352010)', diff saved to https://phabricator.wikimedia.org/P54950 and previous config saved to /var/cache/conftool/dbconfig/20240118-180456-ladsgroup.json
17:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T354336)', diff saved to https://phabricator.wikimedia.org/P54949 and previous config saved to /var/cache/conftool/dbconfig/20240118-175420-marostegui.json
17:52 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1202 (T354336)', diff saved to https://phabricator.wikimedia.org/P54948 and previous config saved to /var/cache/conftool/dbconfig/20240118-175209-marostegui.json
17:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1202.eqiad.wmnet with reason: Maintenance
17:51 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1202.eqiad.wmnet with reason: Maintenance
17:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T354336)', diff saved to https://phabricator.wikimedia.org/P54947 and previous config saved to /var/cache/conftool/dbconfig/20240118-175147-marostegui.json
17:43 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2097.codfw.wmnet with OS bullseye
17:42 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2101.codfw.wmnet with OS bullseye
17:39 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2096.codfw.wmnet with OS bullseye
17:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P54946 and previous config saved to /var/cache/conftool/dbconfig/20240118-173640-marostegui.json
17:36 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2095.codfw.wmnet with OS bullseye
17:36 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2100.codfw.wmnet with OS bullseye
17:33 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2094.codfw.wmnet with OS bullseye
17:31 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2102.codfw.wmnet with OS bullseye
17:30 topranks: Re-enabling PyBal on lvs2011 after network migration T352912
17:30 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2093.codfw.wmnet with OS bullseye
17:28 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2099.codfw.wmnet with OS bullseye
17:27 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2092.codfw.wmnet with OS bullseye
17:25 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2091.codfw.wmnet with OS bullseye
17:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P54945 and previous config saved to /var/cache/conftool/dbconfig/20240118-172134-marostegui.json
17:20 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2098.codfw.wmnet with OS bullseye
17:14 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2102.codfw.wmnet with reason: host reimage
17:11 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2102.codfw.wmnet with reason: host reimage
17:11 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2089.codfw.wmnet with OS bullseye
17:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T354336)', diff saved to https://phabricator.wikimedia.org/P54944 and previous config saved to /var/cache/conftool/dbconfig/20240118-170627-marostegui.json
17:06 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2088.codfw.wmnet with OS bullseye
17:04 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1194 (T354336)', diff saved to https://phabricator.wikimedia.org/P54943 and previous config saved to /var/cache/conftool/dbconfig/20240118-170417-marostegui.json
17:04 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1194.eqiad.wmnet with reason: Maintenance
17:04 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1194.eqiad.wmnet with reason: Maintenance
17:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T354336)', diff saved to https://phabricator.wikimedia.org/P54942 and previous config saved to /var/cache/conftool/dbconfig/20240118-170355-marostegui.json
16:54 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2102.codfw.wmnet with OS bullseye
16:49 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2101.codfw.wmnet with OS bullseye
16:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P54941 and previous config saved to /var/cache/conftool/dbconfig/20240118-164848-marostegui.json
16:42 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2100.codfw.wmnet with OS bullseye
16:36 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2090.codfw.wmnet with OS bullseye
16:35 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2099.codfw.wmnet with OS bullseye
16:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P54940 and previous config saved to /var/cache/conftool/dbconfig/20240118-163342-marostegui.json
16:33 hashar@deploy2002: Finished deploy [integration/docroot@1d9323f]: Remove Wikimedia Design Style Guide from the list - T347895 (duration: 00m 07s)
16:33 hashar@deploy2002: Started deploy [integration/docroot@1d9323f]: Remove Wikimedia Design Style Guide from the list - T347895
16:27 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2098.codfw.wmnet with OS bullseye
16:25 sukhe: running authdns-update for T355308
16:22 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2097.codfw.wmnet with OS bullseye
16:18 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2090.codfw.wmnet with reason: host reimage
16:18 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T354336)', diff saved to https://phabricator.wikimedia.org/P54939 and previous config saved to /var/cache/conftool/dbconfig/20240118-161834-marostegui.json
16:18 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2096.codfw.wmnet with OS bullseye
16:18 claime: Running puppet on 'P{P:kubernetes::node} and not P{F:lldp.parent ~ lsw}' - T352883
16:16 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1191 (T354336)', diff saved to https://phabricator.wikimedia.org/P54938 and previous config saved to /var/cache/conftool/dbconfig/20240118-161624-marostegui.json
16:16 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1191.eqiad.wmnet with reason: Maintenance
16:16 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1191.eqiad.wmnet with reason: Maintenance
16:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T354336)', diff saved to https://phabricator.wikimedia.org/P54937 and previous config saved to /var/cache/conftool/dbconfig/20240118-161602-marostegui.json
16:15 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2090.codfw.wmnet with reason: host reimage
16:15 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2095.codfw.wmnet with OS bullseye
16:12 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2094.codfw.wmnet with OS bullseye
16:09 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2093.codfw.wmnet with OS bullseye
16:06 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2092.codfw.wmnet with OS bullseye
16:06 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 6 hosts with reason: moving lvs2011 network link T352912
16:06 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on 6 hosts with reason: moving lvs2011 network link T352912
16:06 cmooney@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on cr2-codfw,cr[1-2]-codfw IPv6,re0.cr1-codfw.mgmt,re0.cr2-codfw.mgmt cr1-codfw with reason: moving lvs2011 network link T352912
16:05 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cr2-codfw,cr[1-2]-codfw IPv6,re0.cr1-codfw.mgmt,re0.cr2-codfw.mgmt cr1-codfw with reason: moving lvs2011 network link T352912
16:04 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2011.codfw.wmnet with reason: moving lvs2011 network link T352912
16:04 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs2011.codfw.wmnet with reason: moving lvs2011 network link T352912
16:04 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2091.codfw.wmnet with OS bullseye
16:03 claime: Running puppet on 'P{P:kubernetes::node} and P{F:lldp.parent ~ lsw}' - T352883
16:02 topranks: disabling PyBal and puppet on lvs2011, moving traffic to lvs2014 ahead of network change T352912
16:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P54936 and previous config saved to /var/cache/conftool/dbconfig/20240118-160055-marostegui.json
15:59 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1461.eqiad.wmnet with OS bullseye
15:57 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2090.codfw.wmnet with OS bullseye
15:56 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1439.eqiad.wmnet with OS bullseye
15:54 claime: Running puppet on A:wikikube-staging-worker - T352883
15:53 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1469.eqiad.wmnet with OS bullseye
15:52 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1045.eqiad.wmnet
15:52 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2045.codfw.wmnet
15:52 claime: Running puppet on kubestage2002 - T352883
15:52 claime: stopping puppet on P:kubernetes::node to deploy 980927 - T352883
15:50 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2089.codfw.wmnet with OS bullseye
15:49 claime: Running puppet on kubestage2002 - T352893
15:46 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1045.eqiad.wmnet
15:46 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2045.codfw.wmnet
15:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P54935 and previous config saved to /var/cache/conftool/dbconfig/20240118-154549-marostegui.json
15:45 claime: stopping puppet on P:kubernetes::node to deploy 980927 - T352893
15:45 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2088.codfw.wmnet with OS bullseye
15:40 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1461.eqiad.wmnet with reason: host reimage
15:37 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1439.eqiad.wmnet with reason: host reimage
15:35 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1469.eqiad.wmnet with reason: host reimage
15:32 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1461.eqiad.wmnet with reason: host reimage
15:32 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1439.eqiad.wmnet with reason: host reimage
15:31 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1469.eqiad.wmnet with reason: host reimage
15:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T354336)', diff saved to https://phabricator.wikimedia.org/P54933 and previous config saved to /var/cache/conftool/dbconfig/20240118-153042-marostegui.json
15:28 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1174 (T354336)', diff saved to https://phabricator.wikimedia.org/P54932 and previous config saved to /var/cache/conftool/dbconfig/20240118-152832-marostegui.json
15:28 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1174.eqiad.wmnet with reason: Maintenance
15:28 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1174.eqiad.wmnet with reason: Maintenance
15:28 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1171.eqiad.wmnet with reason: Maintenance
15:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1171.eqiad.wmnet with reason: Maintenance
15:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T354336)', diff saved to https://phabricator.wikimedia.org/P54931 and previous config saved to /var/cache/conftool/dbconfig/20240118-152747-marostegui.json
15:20 marostegui@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 100%: T355313', diff saved to https://phabricator.wikimedia.org/P54930 and previous config saved to /var/cache/conftool/dbconfig/20240118-152006-root.json
15:18 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1439.eqiad.wmnet with OS bullseye
15:18 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1469.eqiad.wmnet with OS bullseye
15:18 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1461.eqiad.wmnet with OS bullseye
15:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P54929 and previous config saved to /var/cache/conftool/dbconfig/20240118-151241-marostegui.json
15:05 marostegui@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 75%: T355313', diff saved to https://phabricator.wikimedia.org/P54928 and previous config saved to /var/cache/conftool/dbconfig/20240118-150501-root.json
14:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P54927 and previous config saved to /var/cache/conftool/dbconfig/20240118-145734-marostegui.json
14:49 marostegui@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 50%: T355313', diff saved to https://phabricator.wikimedia.org/P54926 and previous config saved to /var/cache/conftool/dbconfig/20240118-144956-root.json
14:43 Dreamy_Jazz: Afternoon UTC backport window done
14:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T354336)', diff saved to https://phabricator.wikimedia.org/P54925 and previous config saved to /var/cache/conftool/dbconfig/20240118-144228-marostegui.json
14:42 Emperor: disable puppet on ms-be2072 to try and deal with faulty drive
14:42 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1170:3317 (T354336)', diff saved to https://phabricator.wikimedia.org/P54924 and previous config saved to /var/cache/conftool/dbconfig/20240118-144214-marostegui.json
14:42 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1170.eqiad.wmnet with reason: Maintenance
14:41 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1170.eqiad.wmnet with reason: Maintenance
14:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T354336)', diff saved to https://phabricator.wikimedia.org/P54923 and previous config saved to /var/cache/conftool/dbconfig/20240118-144152-marostegui.json
14:41 Dreamy_Jazz: Ran `echo 'https://en.wikipedia.org/static/images/mobile/copyright/wikipedia-tagline-th.svg' | mwscript purgeList.php`, `echo 'https://en.wikipedia.org/static/images/mobile/copyright/wikipedia-wordmark-th.svg' | mwscript purgeList.php`, `echo 'https://en.wikipedia.org/static/images/project-logos/thwiki.png' | mwscript purgeList.php`, `echo 'https://en.wikipedia.org/static/images/project-logos/thwiki-1.5x.png' | mwscript purgeList.php`, and `echo 'https://en.wikipedia.org/static/images/project-logos/thwiki-2x.png' | mwscript purgeList.php`
14:38 dreamyjazz@deploy2002: Finished scap: Backport for thwiki: update tagline and optimise other logos (T341407) (duration: 08m 28s)
14:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10 days, 0:00:00 on debmonitor2003.codfw.wmnet with reason: WIP
14:35 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 10 days, 0:00:00 on debmonitor2003.codfw.wmnet with reason: WIP
14:34 marostegui@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 25%: T355313', diff saved to https://phabricator.wikimedia.org/P54922 and previous config saved to /var/cache/conftool/dbconfig/20240118-143451-root.json
14:33 dreamyjazz@deploy2002: anzx and dreamyjazz: Continuing with sync
14:31 dreamyjazz@deploy2002: anzx and dreamyjazz: Backport for thwiki: update tagline and optimise other logos (T341407) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:30 dreamyjazz@deploy2002: Started scap: Backport for thwiki: update tagline and optimise other logos (T341407)
14:28 kartik@deploy2002: Finished scap: Backport for Set MT threshold for Punjabi Wikipedia to 97 (T347789) (duration: 10m 03s)
14:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P54921 and previous config saved to /var/cache/conftool/dbconfig/20240118-142646-marostegui.json
14:24 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: aqs
14:22 kartik@deploy2002: kartik: Continuing with sync
14:19 kartik@deploy2002: kartik: Backport for Set MT threshold for Punjabi Wikipedia to 97 (T347789) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:19 marostegui@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 10%: T355313', diff saved to https://phabricator.wikimedia.org/P54920 and previous config saved to /var/cache/conftool/dbconfig/20240118-141946-root.json
14:18 kartik@deploy2002: Started scap: Backport for Set MT threshold for Punjabi Wikipedia to 97 (T347789)
14:12 Dreamy_Jazz: running on a tmux session `mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --use-jobqueue --sleep 30 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-30-no-render-now.txt`
14:11 dreamyjazz@deploy2002: Finished scap: Backport for Remove RENDER_NOW from File::transform call to avoid job thumbnailing (T355309) (duration: 07m 50s)
14:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P54919 and previous config saved to /var/cache/conftool/dbconfig/20240118-141139-marostegui.json
14:07 Dreamy_Jazz: Stopped MediaModeration scan for commonswiki
14:07 Dreamy_Jazz: stopped MediaModeration scan for group2
14:06 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: aqs
14:06 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
14:05 dreamyjazz@deploy2002: dreamyjazz: Backport for Remove RENDER_NOW from File::transform call to avoid job thumbnailing (T355309) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:04 marostegui@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 5%: T355313', diff saved to https://phabricator.wikimedia.org/P54918 and previous config saved to /var/cache/conftool/dbconfig/20240118-140441-root.json
14:03 dreamyjazz@deploy2002: Started scap: Backport for Remove RENDER_NOW from File::transform call to avoid job thumbnailing (T355309)
13:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T354336)', diff saved to https://phabricator.wikimedia.org/P54917 and previous config saved to /var/cache/conftool/dbconfig/20240118-135633-marostegui.json
13:54 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1158 (T354336)', diff saved to https://phabricator.wikimedia.org/P54916 and previous config saved to /var/cache/conftool/dbconfig/20240118-135422-marostegui.json
13:54 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
13:54 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
13:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1158.eqiad.wmnet with reason: Maintenance
13:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1158.eqiad.wmnet with reason: Maintenance
13:51 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2107.codfw.wmnet with reason: Maintenance
13:51 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2107.codfw.wmnet with reason: Maintenance
13:49 marostegui@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 1%: T355313', diff saved to https://phabricator.wikimedia.org/P54915 and previous config saved to /var/cache/conftool/dbconfig/20240118-134936-root.json
13:28 moritzm: installing python-requests security updates
13:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189 (T354336)', diff saved to https://phabricator.wikimedia.org/P54914 and previous config saved to /var/cache/conftool/dbconfig/20240118-130451-marostegui.json
12:54 stran@deploy2002: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
12:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1221 (T352010)', diff saved to https://phabricator.wikimedia.org/P54913 and previous config saved to /var/cache/conftool/dbconfig/20240118-125130-ladsgroup.json
12:51 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
12:51 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
12:51 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1221.eqiad.wmnet with reason: Maintenance
12:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1221.eqiad.wmnet with reason: Maintenance
12:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T352010)', diff saved to https://phabricator.wikimedia.org/P54912 and previous config saved to /var/cache/conftool/dbconfig/20240118-125048-ladsgroup.json
12:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P54911 and previous config saved to /var/cache/conftool/dbconfig/20240118-124945-marostegui.json
12:41 godog: grafana restarted on grafana1002 after https://gerrit.wikimedia.org/r/c/operations/puppet/+/991573
12:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P54910 and previous config saved to /var/cache/conftool/dbconfig/20240118-123541-ladsgroup.json
12:35 stran@deploy2002: helmfile [codfw] START helmfile.d/services/ipoid: apply
12:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P54909 and previous config saved to /var/cache/conftool/dbconfig/20240118-123439-marostegui.json
12:34 stran@deploy2002: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
12:33 stran@deploy2002: helmfile [eqiad] START helmfile.d/services/ipoid: apply
12:31 stran@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
12:28 stran@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
12:27 Dreamy_Jazz: Finished security deploy for T347742
12:27 dreamyjazz@deploy2002: Finished scap: Backport for SECURITY: Use message label instead of sanitized text output for massmessage-form-page-help message (T347742) (duration: 08m 28s)
12:27 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1047.eqiad.wmnet
12:26 stran@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
12:24 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2047.codfw.wmnet
12:21 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
12:20 dreamyjazz@deploy2002: dreamyjazz: Backport for SECURITY: Use message label instead of sanitized text output for massmessage-form-page-help message (T347742) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
12:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P54908 and previous config saved to /var/cache/conftool/dbconfig/20240118-122035-ladsgroup.json
12:20 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2047.codfw.wmnet
12:20 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1047.eqiad.wmnet
12:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189 (T354336)', diff saved to https://phabricator.wikimedia.org/P54907 and previous config saved to /var/cache/conftool/dbconfig/20240118-121932-marostegui.json
12:18 dreamyjazz@deploy2002: Started scap: Backport for SECURITY: Use message label instead of sanitized text output for massmessage-form-page-help message (T347742)
12:17 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
12:17 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
12:16 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
12:16 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
12:16 jynus: depooled db2146, a lot of lag, should be investigated later
12:15 jynus@cumin1002: dbctl commit (dc=all): 'Depool db2146', diff saved to https://phabricator.wikimedia.org/P54906 and previous config saved to /var/cache/conftool/dbconfig/20240118-121541-jynus.json
12:07 Dreamy_Jazz: Doing security deploy for T347742
12:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T352010)', diff saved to https://phabricator.wikimedia.org/P54905 and previous config saved to /var/cache/conftool/dbconfig/20240118-120528-ladsgroup.json
11:45 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2189 (T354336)', diff saved to https://phabricator.wikimedia.org/P54904 and previous config saved to /var/cache/conftool/dbconfig/20240118-114551-marostegui.json
11:45 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2189.codfw.wmnet with reason: Maintenance
11:45 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2189.codfw.wmnet with reason: Maintenance
11:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T354336)', diff saved to https://phabricator.wikimedia.org/P54903 and previous config saved to /var/cache/conftool/dbconfig/20240118-114528-marostegui.json
11:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P54902 and previous config saved to /var/cache/conftool/dbconfig/20240118-113022-marostegui.json
11:21 godog: bounce apache2 on logstash1025 / logstash1031 - T337818
11:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P54901 and previous config saved to /var/cache/conftool/dbconfig/20240118-111516-marostegui.json
11:04 cmooney@cumin1002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin[1001-1002].eqiad.wmnet with reason: Release v0.6.5 - cmooney@cumin1002
11:01 cmooney@cumin1002: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin[1001-1002].eqiad.wmnet with reason: Release v0.6.5 - cmooney@cumin1002
11:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T354336)', diff saved to https://phabricator.wikimedia.org/P54900 and previous config saved to /var/cache/conftool/dbconfig/20240118-110009-marostegui.json
10:43 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2175 (T354336)', diff saved to https://phabricator.wikimedia.org/P54899 and previous config saved to /var/cache/conftool/dbconfig/20240118-104335-marostegui.json
10:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2175.codfw.wmnet with reason: Maintenance
10:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2175.codfw.wmnet with reason: Maintenance
10:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312 (T354336)', diff saved to https://phabricator.wikimedia.org/P54898 and previous config saved to /var/cache/conftool/dbconfig/20240118-104313-marostegui.json
10:37 hashar@deploy2002: Finished deploy [integration/docroot@8f5aa9e]: Add Codex Icons package (duration: 00m 05s)
10:36 hashar@deploy2002: Started deploy [integration/docroot@8f5aa9e]: Add Codex Icons package
10:32 hashar@deploy2002: Finished deploy [integration/docroot@88f6458]: Add npm package link for Codex Design Tokens - T354310 (duration: 00m 07s)
10:32 hashar@deploy2002: Started deploy [integration/docroot@88f6458]: Add npm package link for Codex Design Tokens - T354310
10:29 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2072.codfw.wmnet
10:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312', diff saved to https://phabricator.wikimedia.org/P54896 and previous config saved to /var/cache/conftool/dbconfig/20240118-102806-marostegui.json
10:26 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2047.codfw.wmnet
10:25 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2072.codfw.wmnet
10:22 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2047.codfw.wmnet
10:19 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1047.eqiad.wmnet
10:13 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1047.eqiad.wmnet
10:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312', diff saved to https://phabricator.wikimedia.org/P54894 and previous config saved to /var/cache/conftool/dbconfig/20240118-101300-marostegui.json
10:10 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2046.codfw.wmnet
10:09 Dreamy_Jazz: T351400 running on a tmux session `foreachwikiindblist group2.dblist extensions/MediaModeration/maintenance/scanFilesInScanTable.php --sleep 0 --verbose 2>&1 | tee ~/scan-files-in-scan-table-group2-sleep-0-non-jobqueue.txt`
10:04 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2046.codfw.wmnet
10:01 btullis: built and published updated openjdk-11 images based on: 11.0.21-s0-20240111
09:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312 (T354336)', diff saved to https://phabricator.wikimedia.org/P54893 and previous config saved to /var/cache/conftool/dbconfig/20240118-095753-marostegui.json
09:55 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2170:3312 (T354336)', diff saved to https://phabricator.wikimedia.org/P54892 and previous config saved to /var/cache/conftool/dbconfig/20240118-095522-marostegui.json
09:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2170.codfw.wmnet with reason: Maintenance
09:55 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2170.codfw.wmnet with reason: Maintenance
09:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T354336)', diff saved to https://phabricator.wikimedia.org/P54891 and previous config saved to /var/cache/conftool/dbconfig/20240118-095500-marostegui.json
09:42 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1046.eqiad.wmnet
09:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P54890 and previous config saved to /var/cache/conftool/dbconfig/20240118-093954-marostegui.json
09:30 jnuche@deploy2002: rebuilt and synchronized wikiversions files: group2 wikis to 1.42.0-wmf.14 refs T354432
09:26 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1046.eqiad.wmnet
09:25 godog: add 50G to prometheus@k8s-mlserve in codfw
09:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P54889 and previous config saved to /var/cache/conftool/dbconfig/20240118-092447-marostegui.json
09:15 Dreamy_Jazz: T351400 running on a tmux session `mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --sleep 0 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-0-non-jobqueue.txt`
09:12 Dreamy_Jazz: stopped MediaModeration scanning script
09:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T354336)', diff saved to https://phabricator.wikimedia.org/P54888 and previous config saved to /var/cache/conftool/dbconfig/20240118-090941-marostegui.json
09:07 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2148 (T354336)', diff saved to https://phabricator.wikimedia.org/P54887 and previous config saved to /var/cache/conftool/dbconfig/20240118-090712-marostegui.json
09:07 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2148.codfw.wmnet with reason: Maintenance
09:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2148.codfw.wmnet with reason: Maintenance
09:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312 (T354336)', diff saved to https://phabricator.wikimedia.org/P54886 and previous config saved to /var/cache/conftool/dbconfig/20240118-090649-marostegui.json
08:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312', diff saved to https://phabricator.wikimedia.org/P54885 and previous config saved to /var/cache/conftool/dbconfig/20240118-085143-marostegui.json
08:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312', diff saved to https://phabricator.wikimedia.org/P54884 and previous config saved to /var/cache/conftool/dbconfig/20240118-083636-marostegui.json
08:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312 (T354336)', diff saved to https://phabricator.wikimedia.org/P54883 and previous config saved to /var/cache/conftool/dbconfig/20240118-082130-marostegui.json
08:19 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2138:3312 (T354336)', diff saved to https://phabricator.wikimedia.org/P54882 and previous config saved to /var/cache/conftool/dbconfig/20240118-081900-marostegui.json
08:18 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2138.codfw.wmnet with reason: Maintenance
08:18 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2138.codfw.wmnet with reason: Maintenance
08:18 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126 (T354336)', diff saved to https://phabricator.wikimedia.org/P54881 and previous config saved to /var/cache/conftool/dbconfig/20240118-081838-marostegui.json
08:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P54880 and previous config saved to /var/cache/conftool/dbconfig/20240118-080332-marostegui.json
07:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P54879 and previous config saved to /var/cache/conftool/dbconfig/20240118-074825-marostegui.json
07:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126 (T354336)', diff saved to https://phabricator.wikimedia.org/P54878 and previous config saved to /var/cache/conftool/dbconfig/20240118-073319-marostegui.json
07:30 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2126 (T354336)', diff saved to https://phabricator.wikimedia.org/P54877 and previous config saved to /var/cache/conftool/dbconfig/20240118-073054-marostegui.json
07:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db2187.codfw.wmnet with reason: Maintenance
07:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on db2187.codfw.wmnet with reason: Maintenance
07:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2126.codfw.wmnet with reason: Maintenance
07:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2126.codfw.wmnet with reason: Maintenance
07:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125 (T354336)', diff saved to https://phabricator.wikimedia.org/P54876 and previous config saved to /var/cache/conftool/dbconfig/20240118-073016-marostegui.json
07:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125', diff saved to https://phabricator.wikimedia.org/P54875 and previous config saved to /var/cache/conftool/dbconfig/20240118-071509-marostegui.json
07:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125', diff saved to https://phabricator.wikimedia.org/P54874 and previous config saved to /var/cache/conftool/dbconfig/20240118-070003-marostegui.json
06:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125 (T354336)', diff saved to https://phabricator.wikimedia.org/P54873 and previous config saved to /var/cache/conftool/dbconfig/20240118-064456-marostegui.json
06:42 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2125 (T354336)', diff saved to https://phabricator.wikimedia.org/P54872 and previous config saved to /var/cache/conftool/dbconfig/20240118-064225-marostegui.json
06:42 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2125.codfw.wmnet with reason: Maintenance
06:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2125.codfw.wmnet with reason: Maintenance
06:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2104 (T354336)', diff saved to https://phabricator.wikimedia.org/P54871 and previous config saved to /var/cache/conftool/dbconfig/20240118-064203-marostegui.json
06:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2104', diff saved to https://phabricator.wikimedia.org/P54870 and previous config saved to /var/cache/conftool/dbconfig/20240118-062657-marostegui.json
06:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2104', diff saved to https://phabricator.wikimedia.org/P54869 and previous config saved to /var/cache/conftool/dbconfig/20240118-061150-marostegui.json
06:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1199 (T352010)', diff saved to https://phabricator.wikimedia.org/P54868 and previous config saved to /var/cache/conftool/dbconfig/20240118-061138-ladsgroup.json
06:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1199.eqiad.wmnet with reason: Maintenance
06:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1199.eqiad.wmnet with reason: Maintenance
06:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T352010)', diff saved to https://phabricator.wikimedia.org/P54867 and previous config saved to /var/cache/conftool/dbconfig/20240118-061116-ladsgroup.json
05:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2104 (T354336)', diff saved to https://phabricator.wikimedia.org/P54866 and previous config saved to /var/cache/conftool/dbconfig/20240118-055643-marostegui.json
05:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P54865 and previous config saved to /var/cache/conftool/dbconfig/20240118-055609-ladsgroup.json
05:54 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2104 (T354336)', diff saved to https://phabricator.wikimedia.org/P54864 and previous config saved to /var/cache/conftool/dbconfig/20240118-055419-marostegui.json
05:54 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2104.codfw.wmnet with reason: Maintenance
05:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2104.codfw.wmnet with reason: Maintenance
05:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2097.codfw.wmnet with reason: Maintenance
05:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2097.codfw.wmnet with reason: Maintenance
05:48 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1162.eqiad.wmnet with reason: Maintenance
05:48 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1162.eqiad.wmnet with reason: Maintenance
05:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P54863 and previous config saved to /var/cache/conftool/dbconfig/20240118-054103-ladsgroup.json
05:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T352010)', diff saved to https://phabricator.wikimedia.org/P54862 and previous config saved to /var/cache/conftool/dbconfig/20240118-052556-ladsgroup.json

2024-01-17

23:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1190 (T352010)', diff saved to https://phabricator.wikimedia.org/P54861 and previous config saved to /var/cache/conftool/dbconfig/20240117-233655-ladsgroup.json
23:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1190.eqiad.wmnet with reason: Maintenance
23:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1190.eqiad.wmnet with reason: Maintenance
22:01 inflatador: bking@kafka-main2001 `kafka topics --alter --topic eqiad.cirrussearch.update_pipeline.fetch_error.rc0 --partitions 5` T354595
21:55 catrope@deploy2002: Finished scap: Backport for Fix text overflow in history page (T354218) (duration: 09m 39s)
21:50 inflatador: bking@kafka-main2001 `kafka topics --alter --topic codfw.cirrussearch.update_pipeline.fetch_error.rc0 --partitions 5` T354595
21:49 catrope@deploy2002: jdlrobson and catrope: Continuing with sync
21:47 catrope@deploy2002: jdlrobson and catrope: Backport for Fix text overflow in history page (T354218) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
21:47 inflatador: bking@kafka-main2001 `kafka topics --alter --topic eqiad.cirrussearch.update_pipeline.update.rc0 --partitions 5` T354595
21:45 catrope@deploy2002: Started scap: Backport for Fix text overflow in history page (T354218)
21:43 catrope@deploy2002: Finished scap: Backport for Enable desktop history page for all mobile logged in users (T353388) (duration: 15m 15s)
21:37 catrope@deploy2002: jdlrobson and catrope: Continuing with sync
21:30 catrope@deploy2002: jdlrobson and catrope: Backport for Enable desktop history page for all mobile logged in users (T353388) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
21:28 catrope@deploy2002: Started scap: Backport for Enable desktop history page for all mobile logged in users (T353388)
21:16 inflatador: bking@kafka-main1001 `kafka topics --alter --topic codfw.cirrussearch.update_pipeline.fetch_error.rc0 --partitions 5`
21:15 inflatador: bking@kafka-main1001 `kafka topics --alter --topic eqiad.cirrussearch.update_pipeline.update.rc0 --partitions 5` T354595
21:13 kharlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
21:13 kharlan@deploy2002: helmfile [eqiad] START helmfile.d/services/ipoid: apply
21:13 inflatador: bking@kafka-main1001 `kafka topics --alter --topic codfw.cirrussearch.update_pipeline.update.rc0 --partitions 5`
21:07 kharlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
21:07 kharlan@deploy2002: helmfile [codfw] START helmfile.d/services/ipoid: apply
21:06 kharlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
21:06 kharlan@deploy2002: helmfile [eqiad] START helmfile.d/services/ipoid: apply
21:05 kharlan@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
21:04 kharlan@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
20:15 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
20:15 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
20:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T354336)', diff saved to https://phabricator.wikimedia.org/P54860 and previous config saved to /var/cache/conftool/dbconfig/20240117-201513-marostegui.json
20:05 mutante: LDAP - added uid=dimakoushha to groups wmde and nda (T354276)
20:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P54859 and previous config saved to /var/cache/conftool/dbconfig/20240117-200006-marostegui.json
19:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P54858 and previous config saved to /var/cache/conftool/dbconfig/20240117-194500-marostegui.json
19:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T354336)', diff saved to https://phabricator.wikimedia.org/P54857 and previous config saved to /var/cache/conftool/dbconfig/20240117-192953-marostegui.json
19:27 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1233 (T354336)', diff saved to https://phabricator.wikimedia.org/P54856 and previous config saved to /var/cache/conftool/dbconfig/20240117-192737-marostegui.json
19:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1233.eqiad.wmnet with reason: Maintenance
19:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1233.eqiad.wmnet with reason: Maintenance
19:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T354336)', diff saved to https://phabricator.wikimedia.org/P54855 and previous config saved to /var/cache/conftool/dbconfig/20240117-192715-marostegui.json
19:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P54854 and previous config saved to /var/cache/conftool/dbconfig/20240117-191209-marostegui.json
19:01 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
19:00 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
18:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P54853 and previous config saved to /var/cache/conftool/dbconfig/20240117-185703-marostegui.json
18:54 jnuche@deploy2002: Finished scap: deploying K8s config changes from T355243 (duration: 01m 42s)
18:52 jnuche@deploy2002: Started scap: deploying K8s config changes from T355243
18:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T354336)', diff saved to https://phabricator.wikimedia.org/P54852 and previous config saved to /var/cache/conftool/dbconfig/20240117-184156-marostegui.json
18:40 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1229 (T354336)', diff saved to https://phabricator.wikimedia.org/P54851 and previous config saved to /var/cache/conftool/dbconfig/20240117-183944-marostegui.json
18:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1229.eqiad.wmnet with reason: Maintenance
18:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1229.eqiad.wmnet with reason: Maintenance
18:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1225.eqiad.wmnet with reason: Maintenance
18:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1225.eqiad.wmnet with reason: Maintenance
18:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1222 (T354336)', diff saved to https://phabricator.wikimedia.org/P54850 and previous config saved to /var/cache/conftool/dbconfig/20240117-183857-marostegui.json
18:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P54849 and previous config saved to /var/cache/conftool/dbconfig/20240117-182351-marostegui.json
18:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P54848 and previous config saved to /var/cache/conftool/dbconfig/20240117-180844-marostegui.json
17:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1222 (T354336)', diff saved to https://phabricator.wikimedia.org/P54847 and previous config saved to /var/cache/conftool/dbconfig/20240117-175338-marostegui.json
17:51 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1222 (T354336)', diff saved to https://phabricator.wikimedia.org/P54846 and previous config saved to /var/cache/conftool/dbconfig/20240117-175120-marostegui.json
17:51 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1222.eqiad.wmnet with reason: Maintenance
17:51 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1222.eqiad.wmnet with reason: Maintenance
17:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T354336)', diff saved to https://phabricator.wikimedia.org/P54845 and previous config saved to /var/cache/conftool/dbconfig/20240117-175059-marostegui.json
17:39 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2395.codfw.wmnet with OS bullseye
17:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P54844 and previous config saved to /var/cache/conftool/dbconfig/20240117-173552-marostegui.json
17:29 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2357.codfw.wmnet with OS bullseye
17:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P54843 and previous config saved to /var/cache/conftool/dbconfig/20240117-172045-marostegui.json
17:19 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
17:19 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2395.codfw.wmnet with reason: host reimage
17:19 denisse@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host grafana2001.codfw.wmnet with OS bookworm
17:18 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
17:16 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2395.codfw.wmnet with reason: host reimage
17:13 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
17:11 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
17:08 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2357.codfw.wmnet with reason: host reimage
17:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T354336)', diff saved to https://phabricator.wikimedia.org/P54842 and previous config saved to /var/cache/conftool/dbconfig/20240117-170539-marostegui.json
17:05 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2357.codfw.wmnet with reason: host reimage
17:03 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1197 (T354336)', diff saved to https://phabricator.wikimedia.org/P54841 and previous config saved to /var/cache/conftool/dbconfig/20240117-170327-marostegui.json
17:03 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1197.eqiad.wmnet with reason: Maintenance
17:03 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1197.eqiad.wmnet with reason: Maintenance
17:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T354336)', diff saved to https://phabricator.wikimedia.org/P54840 and previous config saved to /var/cache/conftool/dbconfig/20240117-170305-marostegui.json
17:02 denisse@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on grafana2001.codfw.wmnet with reason: host reimage
17:00 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2395.codfw.wmnet with OS bullseye
16:57 denisse@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on grafana2001.codfw.wmnet with reason: host reimage
16:48 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2357.codfw.wmnet with OS bullseye
16:48 hnowlan@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host mw2282.codfw.wmnet with OS bullseye
16:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P54839 and previous config saved to /var/cache/conftool/dbconfig/20240117-164759-marostegui.json
16:42 denisse@cumin2002: START - Cookbook sre.hosts.reimage for host grafana2001.codfw.wmnet with OS bookworm
16:41 jforrester@deploy2002: Finished deploy [integration/docroot@f08a107]: I746134 for T354310 (duration: 00m 07s)
16:40 jforrester@deploy2002: Started deploy [integration/docroot@f08a107]: I746134 for T354310
16:39 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2282.codfw.wmnet with OS bullseye
16:39 hnowlan@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host mw2282.codfw.wmnet with OS bullseye
16:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P54838 and previous config saved to /var/cache/conftool/dbconfig/20240117-163252-marostegui.json
16:29 damilare: civicrm upgraded from 5ef5362f to d8b0c977
16:25 kamila@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
16:23 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2282.codfw.wmnet with OS bullseye
16:23 hnowlan@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host mw2282.codfw.wmnet with OS bullseye
16:22 kamila@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
16:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T354336)', diff saved to https://phabricator.wikimedia.org/P54837 and previous config saved to /var/cache/conftool/dbconfig/20240117-161746-marostegui.json
16:15 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1188 (T354336)', diff saved to https://phabricator.wikimedia.org/P54836 and previous config saved to /var/cache/conftool/dbconfig/20240117-161534-marostegui.json
16:15 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1188.eqiad.wmnet with reason: Maintenance
16:15 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1188.eqiad.wmnet with reason: Maintenance
16:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T354336)', diff saved to https://phabricator.wikimedia.org/P54835 and previous config saved to /var/cache/conftool/dbconfig/20240117-161512-marostegui.json
16:14 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
16:13 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
16:13 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
16:13 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
16:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P54834 and previous config saved to /var/cache/conftool/dbconfig/20240117-160005-marostegui.json
15:54 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on an-tool1005.eqiad.wmnet with reason: Testing new version of Superset
15:54 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on an-tool1005.eqiad.wmnet with reason: Testing new version of Superset
15:54 btullis@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 7 days, 0:00:00 on an-tool1005.eqiad.wmnet with reason: Testing new version of Superset
15:54 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on an-tool1005.eqiad.wmnet with reason: Testing new version of Superset
15:49 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2282.codfw.wmnet with OS bullseye
15:49 hnowlan@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host mw2282.codfw.wmnet with OS bullseye
15:45 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
15:45 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
15:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P54833 and previous config saved to /var/cache/conftool/dbconfig/20240117-154459-marostegui.json
15:39 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2045.codfw.wmnet
15:38 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
15:38 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
15:30 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2045.codfw.wmnet
15:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T354336)', diff saved to https://phabricator.wikimedia.org/P54832 and previous config saved to /var/cache/conftool/dbconfig/20240117-152953-marostegui.json
15:28 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1045.eqiad.wmnet
15:27 taavi: restart etherpad-lite.service on etherpad1003
15:27 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1182 (T354336)', diff saved to https://phabricator.wikimedia.org/P54831 and previous config saved to /var/cache/conftool/dbconfig/20240117-152737-marostegui.json
15:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1182.eqiad.wmnet with reason: Maintenance
15:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1182.eqiad.wmnet with reason: Maintenance
15:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T354336)', diff saved to https://phabricator.wikimedia.org/P54830 and previous config saved to /var/cache/conftool/dbconfig/20240117-152715-marostegui.json
15:23 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1045.eqiad.wmnet
15:22 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: cache::text
15:15 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2282.codfw.wmnet with OS bullseye
15:13 hnowlan@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host mw2282.codfw.wmnet with OS bullseye
15:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P54827 and previous config saved to /var/cache/conftool/dbconfig/20240117-151208-marostegui.json
15:10 Lucas_WMDE: UTC afternoon backport+config window done
15:09 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for Exclude qqq from monolingual text languages (T341409) (duration: 07m 59s)
15:05 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply
15:05 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1044.eqiad.wmnet
15:05 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
15:05 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply
15:05 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2044.codfw.wmnet
15:04 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
15:03 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde: Continuing with sync
15:02 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde: Backport for Exclude qqq from monolingual text languages (T341409) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
15:01 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for Exclude qqq from monolingual text languages (T341409)
14:59 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1044.eqiad.wmnet
14:59 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2044.codfw.wmnet
14:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P54826 and previous config saved to /var/cache/conftool/dbconfig/20240117-145702-marostegui.json
14:52 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: cache::text
14:51 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for Skip tainted references test:distnodiff script to fix Wikibase CI (T354881), Only build result entries for used wbsearchentities results (T355053) (duration: 08m 28s)
14:49 claime: restarted rsyslog on kubernetes2048
14:45 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde: Continuing with sync
14:44 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde: Backport for Skip tainted references test:distnodiff script to fix Wikibase CI (T354881), Only build result entries for used wbsearchentities results (T355053) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:43 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for Skip tainted references test:distnodiff script to fix Wikibase CI (T354881), Only build result entries for used wbsearchentities results (T355053)
14:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T354336)', diff saved to https://phabricator.wikimedia.org/P54824 and previous config saved to /var/cache/conftool/dbconfig/20240117-144156-marostegui.json
14:40 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1170:3312 (T354336)', diff saved to https://phabricator.wikimedia.org/P54823 and previous config saved to /var/cache/conftool/dbconfig/20240117-144039-marostegui.json
14:40 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1170.eqiad.wmnet with reason: Maintenance
14:40 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1170.eqiad.wmnet with reason: Maintenance
14:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T354336)', diff saved to https://phabricator.wikimedia.org/P54822 and previous config saved to /var/cache/conftool/dbconfig/20240117-144018-marostegui.json
14:26 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf2002.codfw.wmnet
14:25 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for Only build result entries for used wbsearchentities results (T355053) (duration: 09m 23s)
14:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P54821 and previous config saved to /var/cache/conftool/dbconfig/20240117-142511-marostegui.json
14:23 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2282.codfw.wmnet with OS bullseye
14:22 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc-wf2002.codfw.wmnet
14:22 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf2001.codfw.wmnet
14:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1149.eqiad.wmnet with reason: Maintenance
14:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1149.eqiad.wmnet with reason: Maintenance
14:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P54820 and previous config saved to /var/cache/conftool/dbconfig/20240117-142015-ladsgroup.json
14:19 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde: Continuing with sync
14:17 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde: Backport for Only build result entries for used wbsearchentities results (T355053) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:16 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for Only build result entries for used wbsearchentities results (T355053)
14:16 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc-wf2001.codfw.wmnet
14:14 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for Remove unused $wgExtraLanguageNames['qqq'] assignment (T263441) (duration: 11m 07s)
14:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P54819 and previous config saved to /var/cache/conftool/dbconfig/20240117-141005-marostegui.json
14:07 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde: Continuing with sync
14:07 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde: Backport for Remove unused $wgExtraLanguageNames['qqq'] assignment (T263441) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P54818 and previous config saved to /var/cache/conftool/dbconfig/20240117-140509-ladsgroup.json
14:03 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for Remove unused $wgExtraLanguageNames['qqq'] assignment (T263441)
13:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T354336)', diff saved to https://phabricator.wikimedia.org/P54817 and previous config saved to /var/cache/conftool/dbconfig/20240117-135459-marostegui.json
13:52 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1156 (T354336)', diff saved to https://phabricator.wikimedia.org/P54816 and previous config saved to /var/cache/conftool/dbconfig/20240117-135242-marostegui.json
13:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
13:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
13:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1156.eqiad.wmnet with reason: Maintenance
13:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1156.eqiad.wmnet with reason: Maintenance
13:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T354336)', diff saved to https://phabricator.wikimedia.org/P54815 and previous config saved to /var/cache/conftool/dbconfig/20240117-135158-marostegui.json
13:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P54814 and previous config saved to /var/cache/conftool/dbconfig/20240117-135002-ladsgroup.json
13:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P54813 and previous config saved to /var/cache/conftool/dbconfig/20240117-133652-marostegui.json
13:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host snapshot1014.eqiad.wmnet
13:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P54812 and previous config saved to /var/cache/conftool/dbconfig/20240117-133456-ladsgroup.json
13:34 damilare: payments-wiki upgraded from 12d8ad5b to e38b24f0
13:32 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
13:32 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
13:30 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
13:30 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
13:30 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host snapshot1014.eqiad.wmnet
13:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P54811 and previous config saved to /var/cache/conftool/dbconfig/20240117-132145-marostegui.json
13:19 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2267.codfw.wmnet with OS bullseye
13:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T354336)', diff saved to https://phabricator.wikimedia.org/P54810 and previous config saved to /var/cache/conftool/dbconfig/20240117-130639-marostegui.json
13:04 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1146:3312 (T354336)', diff saved to https://phabricator.wikimedia.org/P54809 and previous config saved to /var/cache/conftool/dbconfig/20240117-130422-marostegui.json
13:04 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1146.eqiad.wmnet with reason: Maintenance
13:04 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1146.eqiad.wmnet with reason: Maintenance
13:04 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1139.eqiad.wmnet with reason: Maintenance
13:03 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1139.eqiad.wmnet with reason: Maintenance
13:01 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2113.codfw.wmnet with reason: Maintenance
13:01 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2113.codfw.wmnet with reason: Maintenance
12:59 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2267.codfw.wmnet with reason: host reimage
12:58 taavi: removing vlan1119 interface on lvs1018 T355115
12:56 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2267.codfw.wmnet with reason: host reimage
12:47 taavi: removing vlan1119 interface on lvs1020 T355115
12:38 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2267.codfw.wmnet with OS bullseye
12:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192 (T354336)', diff saved to https://phabricator.wikimedia.org/P54806 and previous config saved to /var/cache/conftool/dbconfig/20240117-122305-marostegui.json
12:22 hnowlan: setting mw[2267,2282,2357,2395] inactive in advance of reimaging
12:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P54805 and previous config saved to /var/cache/conftool/dbconfig/20240117-120758-marostegui.json
12:06 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1002.eqiad.wmnet
12:00 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc-wf1002.eqiad.wmnet
12:00 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1001.eqiad.wmnet
12:00 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on mw2394.codfw.wmnet with reason: Bad DIMM
12:00 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2044.codfw.wmnet
12:00 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on mw2394.codfw.wmnet with reason: Bad DIMM
11:59 cgoubert@cumin2002: conftool action : set/pooled=inactive; selector: name=mw2394.codfw.wmnet
11:55 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2044.codfw.wmnet
11:54 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc-wf1001.eqiad.wmnet
11:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P54804 and previous config saved to /var/cache/conftool/dbconfig/20240117-115252-marostegui.json
11:52 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1044.eqiad.wmnet
11:46 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1044.eqiad.wmnet
11:40 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2044.codfw.wmnet
11:40 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1044.eqiad.wmnet
11:39 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: memcached
11:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192 (T354336)', diff saved to https://phabricator.wikimedia.org/P54803 and previous config saved to /var/cache/conftool/dbconfig/20240117-113745-marostegui.json
11:34 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: memcached
11:34 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1044.eqiad.wmnet
11:34 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2192 (T354336)', diff saved to https://phabricator.wikimedia.org/P54802 and previous config saved to /var/cache/conftool/dbconfig/20240117-113432-marostegui.json
11:34 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2044.codfw.wmnet
11:34 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2192.codfw.wmnet with reason: Maintenance
11:34 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2192.codfw.wmnet with reason: Maintenance
11:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T354336)', diff saved to https://phabricator.wikimedia.org/P54801 and previous config saved to /var/cache/conftool/dbconfig/20240117-113410-marostegui.json
11:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P54800 and previous config saved to /var/cache/conftool/dbconfig/20240117-111904-marostegui.json
11:09 Dreamy_Jazz: T351400 running on a tmux session `mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --use-jobqueue --sleep 30 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-30.txt`
11:09 Dreamy_Jazz: stopped scanning script
11:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P54799 and previous config saved to /var/cache/conftool/dbconfig/20240117-110357-marostegui.json
10:49 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1043.eqiad.wmnet
10:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T354336)', diff saved to https://phabricator.wikimedia.org/P54798 and previous config saved to /var/cache/conftool/dbconfig/20240117-104851-marostegui.json
10:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2178 (T354336)', diff saved to https://phabricator.wikimedia.org/P54797 and previous config saved to /var/cache/conftool/dbconfig/20240117-104438-marostegui.json
10:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2178.codfw.wmnet with reason: Maintenance
10:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2178.codfw.wmnet with reason: Maintenance
10:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315 (T354336)', diff saved to https://phabricator.wikimedia.org/P54796 and previous config saved to /var/cache/conftool/dbconfig/20240117-104416-marostegui.json
10:43 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1043.eqiad.wmnet
10:33 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2043.codfw.wmnet
10:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315', diff saved to https://phabricator.wikimedia.org/P54795 and previous config saved to /var/cache/conftool/dbconfig/20240117-102909-marostegui.json
10:26 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2043.codfw.wmnet
10:26 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
10:26 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
10:18 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2043.codfw.wmnet
10:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315', diff saved to https://phabricator.wikimedia.org/P54793 and previous config saved to /var/cache/conftool/dbconfig/20240117-101403-marostegui.json
10:12 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2043.codfw.wmnet
09:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315 (T354336)', diff saved to https://phabricator.wikimedia.org/P54792 and previous config saved to /var/cache/conftool/dbconfig/20240117-095856-marostegui.json
09:58 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
09:58 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
09:58 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
09:55 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2171:3315 (T354336)', diff saved to https://phabricator.wikimedia.org/P54791 and previous config saved to /var/cache/conftool/dbconfig/20240117-095544-marostegui.json
09:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2171.codfw.wmnet with reason: Maintenance
09:55 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2171.codfw.wmnet with reason: Maintenance
09:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T354336)', diff saved to https://phabricator.wikimedia.org/P54790 and previous config saved to /var/cache/conftool/dbconfig/20240117-095521-marostegui.json
09:53 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1043.eqiad.wmnet
09:51 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2042.codfw.wmnet
09:51 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1042.eqiad.wmnet
09:46 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1043.eqiad.wmnet
09:45 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1042.eqiad.wmnet
09:45 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2042.codfw.wmnet
09:40 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1042.eqiad.wmnet
09:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P54789 and previous config saved to /var/cache/conftool/dbconfig/20240117-094015-marostegui.json
09:36 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1042.eqiad.wmnet
09:35 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
09:35 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
09:30 jmm@cumin2002: END (FAIL) - Cookbook sre.puppet.migrate-host (exit_code=99) for host mc2042.codfw.wmnet
09:29 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2042.codfw.wmnet
09:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P54788 and previous config saved to /var/cache/conftool/dbconfig/20240117-092507-marostegui.json
09:21 jnuche@deploy2002: Synchronized php: group1 wikis to 1.42.0-wmf.14 refs T354432 (duration: 06m 15s)
09:15 jnuche@deploy2002: rebuilt and synchronized wikiversions files: group1 wikis to 1.42.0-wmf.14 refs T354432
09:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T354336)', diff saved to https://phabricator.wikimedia.org/P54787 and previous config saved to /var/cache/conftool/dbconfig/20240117-091000-marostegui.json
09:08 jmm@cumin2002: END (FAIL) - Cookbook sre.puppet.migrate-host (exit_code=99) for host mc2042.codfw.wmnet
09:06 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2157 (T354336)', diff saved to https://phabricator.wikimedia.org/P54786 and previous config saved to /var/cache/conftool/dbconfig/20240117-090648-marostegui.json
09:06 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2157.codfw.wmnet with reason: Maintenance
09:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2157.codfw.wmnet with reason: Maintenance
09:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315 (T354336)', diff saved to https://phabricator.wikimedia.org/P54785 and previous config saved to /var/cache/conftool/dbconfig/20240117-090626-marostegui.json
09:02 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2042.codfw.wmnet
08:56 dcausse@deploy2002: Finished scap: Backport for enable page_rerender for all wikis (T351503) (duration: 09m 15s)
08:55 moritzm: installing Python 2.7 security updates
08:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315', diff saved to https://phabricator.wikimedia.org/P54784 and previous config saved to /var/cache/conftool/dbconfig/20240117-085119-marostegui.json
08:50 dcausse@deploy2002: pfischer and dcausse: Continuing with sync
08:48 dcausse@deploy2002: pfischer and dcausse: Backport for enable page_rerender for all wikis (T351503) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
08:46 dcausse@deploy2002: Started scap: Backport for enable page_rerender for all wikis (T351503)
08:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315', diff saved to https://phabricator.wikimedia.org/P54783 and previous config saved to /var/cache/conftool/dbconfig/20240117-083613-marostegui.json
08:23 arnaudb@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20 days, 0:00:00 on db2194.codfw.wmnet with reason: debugging something before T343674
08:22 arnaudb@cumin1001: START - Cookbook sre.hosts.downtime for 20 days, 0:00:00 on db2194.codfw.wmnet with reason: debugging something before T343674
08:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315 (T354336)', diff saved to https://phabricator.wikimedia.org/P54782 and previous config saved to /var/cache/conftool/dbconfig/20240117-082106-marostegui.json
08:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P54781 and previous config saved to /var/cache/conftool/dbconfig/20240117-082001-ladsgroup.json
08:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1146.eqiad.wmnet with reason: Maintenance
08:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1146.eqiad.wmnet with reason: Maintenance
08:17 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2137:3315 (T354336)', diff saved to https://phabricator.wikimedia.org/P54780 and previous config saved to /var/cache/conftool/dbconfig/20240117-081754-marostegui.json
08:17 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2137.codfw.wmnet with reason: Maintenance
08:17 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2137.codfw.wmnet with reason: Maintenance
08:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2128 (T354336)', diff saved to https://phabricator.wikimedia.org/P54779 and previous config saved to /var/cache/conftool/dbconfig/20240117-081731-marostegui.json
08:16 moritzm: installing python-git security updates
08:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2128', diff saved to https://phabricator.wikimedia.org/P54778 and previous config saved to /var/cache/conftool/dbconfig/20240117-080225-marostegui.json
07:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2128', diff saved to https://phabricator.wikimedia.org/P54777 and previous config saved to /var/cache/conftool/dbconfig/20240117-074719-marostegui.json
07:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2128 (T354336)', diff saved to https://phabricator.wikimedia.org/P54776 and previous config saved to /var/cache/conftool/dbconfig/20240117-073212-marostegui.json
07:29 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2128 (T354336)', diff saved to https://phabricator.wikimedia.org/P54775 and previous config saved to /var/cache/conftool/dbconfig/20240117-072902-marostegui.json
07:28 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db2186.codfw.wmnet with reason: Maintenance
07:28 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on db2186.codfw.wmnet with reason: Maintenance
07:28 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2128.codfw.wmnet with reason: Maintenance
07:28 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2128.codfw.wmnet with reason: Maintenance
07:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2123 (T354336)', diff saved to https://phabricator.wikimedia.org/P54774 and previous config saved to /var/cache/conftool/dbconfig/20240117-072824-marostegui.json
07:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2123', diff saved to https://phabricator.wikimedia.org/P54773 and previous config saved to /var/cache/conftool/dbconfig/20240117-071317-marostegui.json
06:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2123', diff saved to https://phabricator.wikimedia.org/P54772 and previous config saved to /var/cache/conftool/dbconfig/20240117-065811-marostegui.json
06:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2123 (T354336)', diff saved to https://phabricator.wikimedia.org/P54771 and previous config saved to /var/cache/conftool/dbconfig/20240117-064304-marostegui.json
06:39 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2123 (T354336)', diff saved to https://phabricator.wikimedia.org/P54770 and previous config saved to /var/cache/conftool/dbconfig/20240117-063951-marostegui.json
06:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2123.codfw.wmnet with reason: Maintenance
06:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2123.codfw.wmnet with reason: Maintenance
06:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2111 (T354336)', diff saved to https://phabricator.wikimedia.org/P54769 and previous config saved to /var/cache/conftool/dbconfig/20240117-063929-marostegui.json
06:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2111', diff saved to https://phabricator.wikimedia.org/P54768 and previous config saved to /var/cache/conftool/dbconfig/20240117-062422-marostegui.json
06:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2111', diff saved to https://phabricator.wikimedia.org/P54767 and previous config saved to /var/cache/conftool/dbconfig/20240117-060916-marostegui.json
05:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2111 (T354336)', diff saved to https://phabricator.wikimedia.org/P54766 and previous config saved to /var/cache/conftool/dbconfig/20240117-055409-marostegui.json
05:50 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2111 (T354336)', diff saved to https://phabricator.wikimedia.org/P54765 and previous config saved to /var/cache/conftool/dbconfig/20240117-055056-marostegui.json
05:50 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2111.codfw.wmnet with reason: Maintenance
05:50 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2111.codfw.wmnet with reason: Maintenance
05:49 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2101.codfw.wmnet with reason: Maintenance
05:49 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2101.codfw.wmnet with reason: Maintenance
05:47 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1183.eqiad.wmnet with reason: Maintenance
05:47 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1183.eqiad.wmnet with reason: Maintenance
03:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1145.eqiad.wmnet with reason: Maintenance
03:37 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1145.eqiad.wmnet with reason: Maintenance
03:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P54764 and previous config saved to /var/cache/conftool/dbconfig/20240117-033751-ladsgroup.json
03:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P54763 and previous config saved to /var/cache/conftool/dbconfig/20240117-032245-ladsgroup.json
03:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P54762 and previous config saved to /var/cache/conftool/dbconfig/20240117-030738-ladsgroup.json
02:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P54761 and previous config saved to /var/cache/conftool/dbconfig/20240117-025232-ladsgroup.json
00:03 tstarling@deploy2002: Synchronized wmf-config: T344791 related cleanup (duration: 06m 22s)

2024-01-16

23:55 tstarling@deploy2002: Synchronized wmf-config/CommonSettings.php: Disable wgUseSameSiteLegacyCookies T344791 (duration: 09m 19s)
21:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P54760 and previous config saved to /var/cache/conftool/dbconfig/20240116-214016-ladsgroup.json
21:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1144.eqiad.wmnet with reason: Maintenance
21:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1144.eqiad.wmnet with reason: Maintenance
20:43 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2297.codfw.wmnet with OS bullseye
20:37 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2296.codfw.wmnet with OS bullseye
20:30 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2295.codfw.wmnet with OS bullseye
20:26 ryankemper: T351650 Running puppet on `P:trafficserver::backend` following merge of https://gerrit.wikimedia.org/r/c/operations/puppet/+/991091
20:25 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2294.codfw.wmnet with OS bullseye
20:23 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2297.codfw.wmnet with reason: host reimage
20:20 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2297.codfw.wmnet with reason: host reimage
20:17 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2296.codfw.wmnet with reason: host reimage
20:16 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2292.codfw.wmnet with OS bullseye
20:13 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2293.codfw.wmnet with OS bullseye
20:13 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2296.codfw.wmnet with reason: host reimage
20:12 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2291.codfw.wmnet with OS bullseye
20:11 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2295.codfw.wmnet with reason: host reimage
20:08 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2295.codfw.wmnet with reason: host reimage
20:06 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2294.codfw.wmnet with reason: host reimage
20:03 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2297.codfw.wmnet with OS bullseye
20:02 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2294.codfw.wmnet with reason: host reimage
19:56 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2296.codfw.wmnet with OS bullseye
19:56 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2292.codfw.wmnet with reason: host reimage
19:53 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2293.codfw.wmnet with reason: host reimage
19:52 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2295.codfw.wmnet with OS bullseye
19:50 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2291.codfw.wmnet with reason: host reimage
19:50 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1375.eqiad.wmnet with OS bullseye
19:49 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2293.codfw.wmnet with reason: host reimage
19:48 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2292.codfw.wmnet with reason: host reimage
19:47 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2291.codfw.wmnet with reason: host reimage
19:47 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1376.eqiad.wmnet with OS bullseye
19:46 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2294.codfw.wmnet with OS bullseye
19:45 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1374.eqiad.wmnet with OS bullseye
19:45 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
19:45 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
19:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1230 (T354336)', diff saved to https://phabricator.wikimedia.org/P54759 and previous config saved to /var/cache/conftool/dbconfig/20240116-194509-marostegui.json
19:34 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1360.eqiad.wmnet with OS bullseye
19:32 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2293.codfw.wmnet with OS bullseye
19:31 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2292.codfw.wmnet with OS bullseye
19:31 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2291.codfw.wmnet with OS bullseye
19:31 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1363.eqiad.wmnet with OS bullseye
19:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1230', diff saved to https://phabricator.wikimedia.org/P54758 and previous config saved to /var/cache/conftool/dbconfig/20240116-193002-marostegui.json
19:29 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1375.eqiad.wmnet with reason: host reimage
19:29 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1361.eqiad.wmnet with OS bullseye
19:27 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1362.eqiad.wmnet with OS bullseye
19:27 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1376.eqiad.wmnet with reason: host reimage
19:24 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1374.eqiad.wmnet with reason: host reimage
19:23 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1376.eqiad.wmnet with reason: host reimage
19:21 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1375.eqiad.wmnet with reason: host reimage
19:21 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1374.eqiad.wmnet with reason: host reimage
19:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1230', diff saved to https://phabricator.wikimedia.org/P54757 and previous config saved to /var/cache/conftool/dbconfig/20240116-191456-marostegui.json
19:13 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1360.eqiad.wmnet with reason: host reimage
19:10 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1363.eqiad.wmnet with reason: host reimage
19:08 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1361.eqiad.wmnet with reason: host reimage
19:08 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1376.eqiad.wmnet with OS bullseye
19:07 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw1362.eqiad.wmnet with reason: host reimage
19:07 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1375.eqiad.wmnet with OS bullseye
19:06 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1374.eqiad.wmnet with OS bullseye
19:06 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1363.eqiad.wmnet with reason: host reimage
19:05 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1362.eqiad.wmnet with reason: host reimage
19:05 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1360.eqiad.wmnet with reason: host reimage
19:04 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1361.eqiad.wmnet with reason: host reimage
18:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1230 (T354336)', diff saved to https://phabricator.wikimedia.org/P54756 and previous config saved to /var/cache/conftool/dbconfig/20240116-185949-marostegui.json
18:57 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1230 (T354336)', diff saved to https://phabricator.wikimedia.org/P54755 and previous config saved to /var/cache/conftool/dbconfig/20240116-185723-marostegui.json
18:57 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1230.eqiad.wmnet with reason: Maintenance
18:57 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1230.eqiad.wmnet with reason: Maintenance
18:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1216.eqiad.wmnet with reason: Maintenance
18:56 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1216.eqiad.wmnet with reason: Maintenance
18:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213:3315 (T354336)', diff saved to https://phabricator.wikimedia.org/P54754 and previous config saved to /var/cache/conftool/dbconfig/20240116-185626-marostegui.json
18:51 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1363.eqiad.wmnet with OS bullseye
18:51 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1362.eqiad.wmnet with OS bullseye
18:50 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1361.eqiad.wmnet with OS bullseye
18:50 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1360.eqiad.wmnet with OS bullseye
18:42 mutante: phab2002 - pulling repo data from phab1004 by running sync script created by rsync::quickdatacopy after gerrit:990247 T354221
18:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213:3315', diff saved to https://phabricator.wikimedia.org/P54753 and previous config saved to /var/cache/conftool/dbconfig/20240116-184120-marostegui.json
18:38 Dreamy_Jazz: T351400 running on a tmux session `mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --sleep 1 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-non-job-queue.txt`
18:36 Dreamy_Jazz: stopped tmux session `mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --use-jobqueue --sleep 30 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-30.txt`
18:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213:3315', diff saved to https://phabricator.wikimedia.org/P54752 and previous config saved to /var/cache/conftool/dbconfig/20240116-182613-marostegui.json
18:20 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
18:19 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
18:19 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
18:19 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
18:18 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
18:18 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
18:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213:3315 (T354336)', diff saved to https://phabricator.wikimedia.org/P54751 and previous config saved to /var/cache/conftool/dbconfig/20240116-181107-marostegui.json
18:08 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1213:3315 (T354336)', diff saved to https://phabricator.wikimedia.org/P54750 and previous config saved to /var/cache/conftool/dbconfig/20240116-180841-marostegui.json
18:08 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1213.eqiad.wmnet with reason: Maintenance
18:08 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1213.eqiad.wmnet with reason: Maintenance
18:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1210 (T354336)', diff saved to https://phabricator.wikimedia.org/P54749 and previous config saved to /var/cache/conftool/dbconfig/20240116-180819-marostegui.json
17:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P54748 and previous config saved to /var/cache/conftool/dbconfig/20240116-175313-marostegui.json
17:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P54747 and previous config saved to /var/cache/conftool/dbconfig/20240116-173806-marostegui.json
17:32 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1460.eqiad.wmnet with OS bullseye
17:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1210 (T354336)', diff saved to https://phabricator.wikimedia.org/P54746 and previous config saved to /var/cache/conftool/dbconfig/20240116-172300-marostegui.json
17:20 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1210 (T354336)', diff saved to https://phabricator.wikimedia.org/P54745 and previous config saved to /var/cache/conftool/dbconfig/20240116-172032-marostegui.json
17:20 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1210.eqiad.wmnet with reason: Maintenance
17:20 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1210.eqiad.wmnet with reason: Maintenance
17:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T354336)', diff saved to https://phabricator.wikimedia.org/P54744 and previous config saved to /var/cache/conftool/dbconfig/20240116-172011-marostegui.json
17:14 topranks: Disabling puppet and PyBal on lvs2012 ahead of migration of network link to lsw1-b2-codfw T352909
17:12 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1460.eqiad.wmnet with reason: host reimage
17:11 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2012.codfw.wmnet with reason: moving lvs hosts codfw T352784 T352918
17:11 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs2012.codfw.wmnet with reason: moving lvs hosts codfw T352784 T352918
17:10 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1460.eqiad.wmnet with reason: host reimage
17:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P54743 and previous config saved to /var/cache/conftool/dbconfig/20240116-170503-marostegui.json
16:56 filippo@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on prometheus1006.eqiad.wmnet with reason: memory upgrade
16:56 filippo@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on prometheus1006.eqiad.wmnet with reason: memory upgrade
16:56 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw1460.eqiad.wmnet with OS bullseye
16:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P54742 and previous config saved to /var/cache/conftool/dbconfig/20240116-164957-marostegui.json
16:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T354336)', diff saved to https://phabricator.wikimedia.org/P54741 and previous config saved to /var/cache/conftool/dbconfig/20240116-163449-marostegui.json
16:33 filippo@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on prometheus1005.eqiad.wmnet with reason: memory upgrade
16:33 filippo@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on prometheus1005.eqiad.wmnet with reason: memory upgrade
16:32 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1200 (T354336)', diff saved to https://phabricator.wikimedia.org/P54740 and previous config saved to /var/cache/conftool/dbconfig/20240116-163224-marostegui.json
16:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1200.eqiad.wmnet with reason: Maintenance
16:32 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1200.eqiad.wmnet with reason: Maintenance
16:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T354336)', diff saved to https://phabricator.wikimedia.org/P54739 and previous config saved to /var/cache/conftool/dbconfig/20240116-163203-marostegui.json
16:22 brennen@deploy2002: Finished deploy [phabricator/deployment@24a2a2a]: deploy to phab1004 for T354969 (duration: 00m 50s)
16:22 brennen@deploy2002: Started deploy [phabricator/deployment@24a2a2a]: deploy to phab1004 for T354969
16:21 brennen@deploy2002: Finished deploy [phabricator/deployment@24a2a2a]: deploy to phab2002 for T354969 (duration: 00m 27s)
16:21 brennen@deploy2002: Started deploy [phabricator/deployment@24a2a2a]: deploy to phab2002 for T354969
16:20 mutante: phabricator deploy is imminent
16:20 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on phab1004.eqiad.wmnet with reason: deployment
16:20 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on phab1004.eqiad.wmnet with reason: deployment
16:20 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on phab2002.codfw.wmnet with reason: deployment
16:19 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on phab2002.codfw.wmnet with reason: deployment
16:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P54738 and previous config saved to /var/cache/conftool/dbconfig/20240116-161656-marostegui.json
16:03 Dreamy_Jazz: T351400 running on a tmux session `mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --use-jobqueue --sleep 30 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-30.txt`
16:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P54737 and previous config saved to /var/cache/conftool/dbconfig/20240116-160150-marostegui.json
16:00 Dreamy_Jazz: stopped `mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --use-jobqueue --sleep 30 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-30.txt`
15:55 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on re0.cr[1-2]-codfw.mgmt with reason: moving lvs hosts codfw T352784 T352918
15:55 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on re0.cr[1-2]-codfw.mgmt with reason: moving lvs hosts codfw T352784 T352918
15:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T354336)', diff saved to https://phabricator.wikimedia.org/P54736 and previous config saved to /var/cache/conftool/dbconfig/20240116-154643-marostegui.json
15:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1185 (T354336)', diff saved to https://phabricator.wikimedia.org/P54735 and previous config saved to /var/cache/conftool/dbconfig/20240116-154419-marostegui.json
15:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1185.eqiad.wmnet with reason: Maintenance
15:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1185.eqiad.wmnet with reason: Maintenance
15:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T354336)', diff saved to https://phabricator.wikimedia.org/P54734 and previous config saved to /var/cache/conftool/dbconfig/20240116-154357-marostegui.json
15:29 Dreamy_Jazz: T351400 running `mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --use-jobqueue --sleep 30 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-30.txt`
15:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P54733 and previous config saved to /var/cache/conftool/dbconfig/20240116-152850-marostegui.json
15:28 Dreamy_Jazz: stopped `mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --use-jobqueue --sleep 25 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-25.txt`
15:27 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cr[1-2]-codfw,cr[1-2]-codfw IPv6,lvs2013 with reason: moving lvs hosts codfw T352784
15:27 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cr[1-2]-codfw,cr[1-2]-codfw IPv6,lvs2013 with reason: moving lvs hosts codfw T352784
15:19 topranks: Disabling puppet and PyBal on lvs2013 ahead of migration of network link to ssw1-a1-codfw T352784
15:18 Dreamy_Jazz: T351400 running `mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --use-jobqueue --sleep 25 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-20.txt`
15:18 Dreamy_Jazz: stopped `mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --use-jobqueue --sleep 20 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-20.txt`
15:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P54732 and previous config saved to /var/cache/conftool/dbconfig/20240116-151344-marostegui.json
15:13 Dreamy_Jazz: T351400 running `mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --use-jobqueue --sleep 20 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-20.txt`
15:11 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
15:07 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
15:00 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
15:00 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove old records for cloud-support1-c-eqiad - cmooney@cumin1002"
14:58 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove old records for cloud-support1-c-eqiad - cmooney@cumin1002"
14:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T354336)', diff saved to https://phabricator.wikimedia.org/P54731 and previous config saved to /var/cache/conftool/dbconfig/20240116-145837-marostegui.json
14:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1161 (T354336)', diff saved to https://phabricator.wikimedia.org/P54730 and previous config saved to /var/cache/conftool/dbconfig/20240116-145613-marostegui.json
14:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
14:55 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
14:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1161.eqiad.wmnet with reason: Maintenance
14:55 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1161.eqiad.wmnet with reason: Maintenance
14:55 cmooney@cumin1002: START - Cookbook sre.dns.netbox
14:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1145.eqiad.wmnet with reason: Maintenance
14:55 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1145.eqiad.wmnet with reason: Maintenance
14:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T354336)', diff saved to https://phabricator.wikimedia.org/P54729 and previous config saved to /var/cache/conftool/dbconfig/20240116-145458-marostegui.json
14:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P54728 and previous config saved to /var/cache/conftool/dbconfig/20240116-143951-marostegui.json
14:33 moritzm: installing ca-certificates-java bugfix updates on bookworm
14:31 Dreamy_Jazz: UTC afternoon deploys done
14:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P54727 and previous config saved to /var/cache/conftool/dbconfig/20240116-142444-marostegui.json
14:24 dreamyjazz@deploy2002: Finished scap: Backport for Add more statsd counters and add logstash logging (T351419) (duration: 07m 15s)
14:18 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
14:18 dreamyjazz@deploy2002: dreamyjazz: Backport for Add more statsd counters and add logstash logging (T351419) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:17 moritzm: installing 5.10.205 kernels on buster hosts running the 5.10 backport
14:16 dreamyjazz@deploy2002: Started scap: Backport for Add more statsd counters and add logstash logging (T351419)
14:14 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2042.codfw.wmnet
14:14 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1041.eqiad.wmnet
14:11 dreamyjazz@deploy2002: Finished scap: Backport for Support parallel PhotoDNA requests (T354408) (duration: 07m 14s)
14:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T354336)', diff saved to https://phabricator.wikimedia.org/P54726 and previous config saved to /var/cache/conftool/dbconfig/20240116-140938-marostegui.json
14:07 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2042.codfw.wmnet
14:07 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1041.eqiad.wmnet
14:07 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1144:3315 (T354336)', diff saved to https://phabricator.wikimedia.org/P54725 and previous config saved to /var/cache/conftool/dbconfig/20240116-140713-marostegui.json
14:07 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1144.eqiad.wmnet with reason: Maintenance
14:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1144.eqiad.wmnet with reason: Maintenance
14:05 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
14:05 dreamyjazz@deploy2002: dreamyjazz: Backport for Support parallel PhotoDNA requests (T354408) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:04 dreamyjazz@deploy2002: Started scap: Backport for Support parallel PhotoDNA requests (T354408)
13:54 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
13:35 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-wf1001.eqiad.wmnet with OS bullseye
13:18 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-wf1001.eqiad.wmnet with reason: host reimage
13:15 jiji@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-wf1001.eqiad.wmnet with reason: host reimage
13:09 jelto@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
13:09 jelto@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
13:08 jelto@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
13:08 jelto@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
13:06 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
13:05 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
13:02 effie: reimage mc-wf1001 (part of puppet7 migration)
13:01 jiji@cumin1002: START - Cookbook sre.hosts.reimage for host mc-wf1001.eqiad.wmnet with OS bullseye
12:57 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1040.eqiad.wmnet
12:56 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2041.codfw.wmnet
12:52 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1040.eqiad.wmnet
12:50 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2041.codfw.wmnet
12:30 moritzm: installing systemd bugfix updates from Bullseye point release
12:18 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc-wf1001.eqiad.wmnet
12:18 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2040.codfw.wmnet
12:11 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2040.codfw.wmnet
12:10 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc-wf1001.eqiad.wmnet
11:56 jnuche@deploy2002: rebuilt and synchronized wikiversions files: group0 wikis to 1.42.0-wmf.14 refs T354432
11:45 jnuche@deploy2002: Finished scap: Backport for PreAuthenticationProvider: Deny account creation based on ipoid data (T354928) (duration: 29m 32s)
11:45 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2041.codfw.wmnet
11:39 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2041.codfw.wmnet
11:36 jnuche@deploy2002: jnuche and kharlan: Continuing with sync
11:36 jnuche@deploy2002: jnuche and kharlan: Backport for PreAuthenticationProvider: Deny account creation based on ipoid data (T354928) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
11:33 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2040.codfw.wmnet
11:26 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2040.codfw.wmnet
11:23 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1041.eqiad.wmnet
11:19 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2039.codfw.wmnet
11:16 jnuche@deploy2002: Started scap: Backport for PreAuthenticationProvider: Deny account creation based on ipoid data (T354928)
11:15 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1041.eqiad.wmnet
11:13 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2039.codfw.wmnet
11:12 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1040.eqiad.wmnet
11:08 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1040.eqiad.wmnet
10:59 jnuche@deploy2002: Finished scap: testwikis wikis to 1.42.0-wmf.14 refs T354432 (duration: 29m 36s)
10:53 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1039.eqiad.wmnet
10:47 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1039.eqiad.wmnet
10:41 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2039.codfw.wmnet
10:35 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2038.codfw.wmnet
10:30 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2039.codfw.wmnet
10:30 jnuche@deploy2002: Started scap: testwikis wikis to 1.42.0-wmf.14 refs T354432
10:29 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc2038.codfw.wmnet
10:24 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc2038.codfw.wmnet
10:21 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1038.eqiad.wmnet
10:16 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc2038.codfw.wmnet
10:16 godog: clean up also 1.42.0-wmf.9 1.42.0-wmf.10 1.42.0-wmf.12 from mw22* - T355117
10:15 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1038.eqiad.wmnet
10:12 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1039.eqiad.wmnet
10:10 godog: manually pruning php-1.42.0-wmf.7 from mw22* - T355117
10:07 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1039.eqiad.wmnet
10:06 jnuche@deploy2002: Pruned MediaWiki: 1.42.0-wmf.7, 1.42.0-wmf.9, 1.42.0-wmf.10, 1.42.0-wmf.12 (duration: 07m 08s)
10:05 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1038.eqiad.wmnet
10:00 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1038.eqiad.wmnet
09:51 jnuche@deploy2002: Finished scap: testwikis wikis to 1.42.0-wmf.14 refs T354432 (duration: 52m 52s)
09:28 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "set cloudvirt2004-dev as active - taavi@cumin1002"
09:26 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "set cloudvirt2004-dev as active - taavi@cumin1002"
09:25 taavi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
09:23 taavi@cumin1002: START - Cookbook sre.dns.netbox
09:05 root@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Daniram3 out of all services on: 2211 hosts
09:04 denisse: reprepro: Copy grafana v9.4.14 from buster to bookworm - T352665
09:03 denisse: reprepro: Copy grafana v9.4.14 from buster to bookworm
09:03 root@cumin2002: START - Cookbook sre.idm.logout Logging Daniram3 out of all services on: 2211 hosts
08:59 jnuche@deploy2002: Started scap: testwikis wikis to 1.42.0-wmf.14 refs T354432

2024-01-15

21:46 reedy@deploy2002: Synchronized wmf-config/: Fix more stringified class names (duration: 06m 29s)
21:37 fab@deploy2002: Finished deploy [airflow-dags/research@9b6a69a]: (no justification provided) (duration: 00m 27s)
21:37 reedy@deploy2002: Synchronized wmf-config/InitialiseSettings.php: Swap stringified class names in ConfirmEdit usages (duration: 06m 30s)
21:36 fab@deploy2002: Started deploy [airflow-dags/research@9b6a69a]: (no justification provided)
21:23 tgr: UTC late deploys done
21:22 tgr@deploy2002: Finished scap: Backport for Log emails in production (duration: 09m 11s)
21:15 tgr@deploy2002: tgr: Continuing with sync
21:14 tgr@deploy2002: tgr: Backport for Log emails in production synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
21:12 tgr@deploy2002: Started scap: Backport for Log emails in production
19:23 tzatziki: creating the u4c2024_edits table on all wikis
17:55 btullis@cumin1002: END (PASS) - Cookbook sre.druid.roll-restart-workers (exit_code=0) for Druid public cluster: Roll restart of Druid jvm daemons.
17:48 btullis@cumin1002: END (PASS) - Cookbook sre.druid.roll-restart-workers (exit_code=0) for Druid analytics cluster: Roll restart of Druid jvm daemons.
17:23 btullis@cumin1002: END (PASS) - Cookbook sre.presto.roll-restart-workers (exit_code=0) for Presto analytics cluster: Roll restart of all Presto's jvm daemons.
17:02 btullis@cumin1002: START - Cookbook sre.druid.roll-restart-workers for Druid public cluster: Roll restart of Druid jvm daemons.
17:00 btullis@cumin1002: START - Cookbook sre.druid.roll-restart-workers for Druid analytics cluster: Roll restart of Druid jvm daemons.
16:51 btullis@cumin1002: START - Cookbook sre.presto.roll-restart-workers for Presto analytics cluster: Roll restart of all Presto's jvm daemons.
16:45 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts dbstore1005.eqiad.wmnet
16:45 btullis@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
16:45 btullis@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: dbstore1005.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1002"
15:26 hnowlan: depooled jobrunner mw1460 to repurpose as k8s node
15:06 btullis@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: dbstore1005.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1002"
15:03 btullis@cumin1002: START - Cookbook sre.dns.netbox
14:59 isaranto@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
14:47 btullis@cumin1002: START - Cookbook sre.hosts.decommission for hosts dbstore1005.eqiad.wmnet
14:38 Lucas_WMDE: UTC afternoon backport+config window done
14:33 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for cawiki: update wgAutoConfirmAge and wgAutoConfirmCount (T354425) (duration: 11m 36s)
14:28 klausman@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
14:28 klausman@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
14:27 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and anzx: Continuing with sync
14:26 jelto@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
14:25 jelto@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
14:24 jelto@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
14:24 jelto@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
14:23 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and anzx: Backport for cawiki: update wgAutoConfirmAge and wgAutoConfirmCount (T354425) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:23 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
14:23 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
14:22 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for cawiki: update wgAutoConfirmAge and wgAutoConfirmCount (T354425)
13:49 aikochou@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
13:26 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-gp2003.codfw.wmnet
13:19 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc-gp2003.codfw.wmnet
13:19 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-gp2002.codfw.wmnet
13:12 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc-gp2002.codfw.wmnet
13:12 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-gp2001.codfw.wmnet
13:09 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-gp1003.eqiad.wmnet
13:05 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc-gp2001.codfw.wmnet
13:03 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc-gp1003.eqiad.wmnet
13:00 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts dbstore1003.eqiad.wmnet
13:00 btullis@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
13:00 btullis@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: dbstore1003.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1002"
12:59 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: mediawiki::memcached::gutter
12:59 btullis@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: dbstore1003.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1002"
12:54 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: mediawiki::memcached::gutter
12:42 btullis@cumin1002: START - Cookbook sre.dns.netbox
12:39 effie: enable puppet on mc* hosts - T349619
12:37 btullis@cumin1002: START - Cookbook sre.hosts.decommission for hosts dbstore1003.eqiad.wmnet
12:23 effie: stopping puppet on all mediawiki memcached hosts (mc*, mc-gp*), puppet 7 migration in progress - T349619
12:01 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 92 hosts
12:00 btullis@cumin1002: START - Cookbook sre.hosts.remove-downtime for 92 hosts
11:41 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply
11:38 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
11:10 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on an-coord[1001-1004].eqiad.wmnet with reason: Bringing new nameservers into service
11:10 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on an-coord[1001-1004].eqiad.wmnet with reason: Bringing new nameservers into service
11:10 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on an-master[1001-1004].eqiad.wmnet with reason: Bringing new nameservers into service
11:10 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on an-master[1001-1004].eqiad.wmnet with reason: Bringing new nameservers into service
11:09 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1037.eqiad.wmnet
11:08 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on 8 hosts with reason: Bringing new nameservers into service
11:08 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on 8 hosts with reason: Bringing new nameservers into service
11:08 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on 97 hosts with reason: Bringing new nameservers into service
11:07 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on 97 hosts with reason: Bringing new nameservers into service
11:03 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc1037.eqiad.wmnet
10:58 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-gp1002.eqiad.wmnet
10:51 jiji@cumin1002: START - Cookbook sre.hosts.reboot-single for host mc-gp1002.eqiad.wmnet
10:48 moritzm: installing systemd bugfix updates from Bullseye point release
10:30 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1037.eqiad.wmnet
10:13 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc1037.eqiad.wmnet
10:08 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc-gp1002.eqiad.wmnet
10:02 ladsgroup@deploy2002: Finished scap: Backport for SecurePoll: Adding updated voterlist files (T349263) (duration: 16m 04s)
09:58 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host mc-gp1002.eqiad.wmnet
09:56 ladsgroup@deploy2002: ladsgroup: Continuing with sync
09:48 ladsgroup@deploy2002: ladsgroup: Backport for SecurePoll: Adding updated voterlist files (T349263) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
09:46 ladsgroup@deploy2002: Started scap: Backport for SecurePoll: Adding updated voterlist files (T349263)
09:16 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
09:16 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
09:15 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
09:15 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
09:15 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
09:14 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
08:45 filippo@deploy2002: Finished deploy [performance/arc-lamp@67389a0]: (no justification provided) (duration: 00m 05s)
08:45 filippo@deploy2002: Started deploy [performance/arc-lamp@67389a0]: (no justification provided)
08:23 dcausse@deploy2002: Finished scap: Backport for enable page_rerender for 5th batch of wikis (T351503) (duration: 11m 40s)
08:17 dcausse@deploy2002: pfischer and dcausse: Continuing with sync
08:13 dcausse@deploy2002: pfischer and dcausse: Backport for enable page_rerender for 5th batch of wikis (T351503) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
08:12 dcausse@deploy2002: Started scap: Backport for enable page_rerender for 5th batch of wikis (T351503)
04:57 andrewbogott: restarting wikitech-static, oom

2024-01-14

15:47 taavi@deploy2002: Finished scap: Backport for Log IpReputation channel as debug (T354928) (duration: 26m 49s)
15:36 taavi@deploy2002: taavi: Continuing with sync
15:35 taavi@deploy2002: taavi: Backport for Log IpReputation channel as debug (T354928) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
15:20 taavi@deploy2002: Started scap: Backport for Log IpReputation channel as debug (T354928)
15:01 andrewbogott: manually emptying /srv/mediawiki/images/wikitech/archive on wikitech-static; the maintenance script didn't do it and the host is failing due to a full disk
15:01 andrewbogott: running deleteArchivedFiles.php on wikitech-static

2024-01-12

23:49 dzahn@cumin1001: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Conniecc1 out of all services on: 2213 hosts
23:47 dzahn@cumin1001: START - Cookbook sre.idm.logout Logging Conniecc1 out of all services on: 2213 hosts
22:52 dzahn@cumin1001: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Conniecc1 out of all services on: 2213 hosts
22:51 dzahn@cumin1001: START - Cookbook sre.idm.logout Logging Conniecc1 out of all services on: 2213 hosts
22:29 dzahn@cumin1001: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Conniecc1 out of all services on: 2213 hosts
22:28 dzahn@cumin1001: START - Cookbook sre.idm.logout Logging Conniecc1 out of all services on: 2213 hosts
18:07 mutante: aphlict1002 - systemctl start logrotate
17:18 tchanders@deploy2002: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
17:18 tchanders@deploy2002: helmfile [codfw] START helmfile.d/services/ipoid: apply
17:17 tchanders@deploy2002: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
17:16 tchanders@deploy2002: helmfile [eqiad] START helmfile.d/services/ipoid: apply
17:10 tchanders@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
17:09 tchanders@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
16:52 cgoubert@deploy2002: helmfile [codfw] [main] DONE helmfile.d/services/mw-jobrunner : sync
16:52 cgoubert@deploy2002: helmfile [codfw] [main] START helmfile.d/services/mw-jobrunner : sync
16:51 cgoubert@deploy2002: helmfile [eqiad] [main] DONE helmfile.d/services/mw-jobrunner : sync
16:51 cgoubert@deploy2002: helmfile [eqiad] [main] START helmfile.d/services/mw-jobrunner : sync
16:20 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
16:20 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
16:20 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
16:19 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
15:46 klausman@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
15:37 klausman@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
15:14 klausman@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
15:14 klausman@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
14:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2114.codfw.wmnet with reason: Maintenance
14:31 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2114.codfw.wmnet with reason: Maintenance
14:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193 (T354336)', diff saved to https://phabricator.wikimedia.org/P54714 and previous config saved to /var/cache/conftool/dbconfig/20240112-140423-marostegui.json
13:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P54713 and previous config saved to /var/cache/conftool/dbconfig/20240112-134916-marostegui.json
13:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P54712 and previous config saved to /var/cache/conftool/dbconfig/20240112-133410-marostegui.json
13:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193 (T354336)', diff saved to https://phabricator.wikimedia.org/P54711 and previous config saved to /var/cache/conftool/dbconfig/20240112-131904-marostegui.json
12:59 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2193 (T354336)', diff saved to https://phabricator.wikimedia.org/P54710 and previous config saved to /var/cache/conftool/dbconfig/20240112-125944-marostegui.json
12:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2193.codfw.wmnet with reason: Maintenance
12:59 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2193.codfw.wmnet with reason: Maintenance
12:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T354336)', diff saved to https://phabricator.wikimedia.org/P54709 and previous config saved to /var/cache/conftool/dbconfig/20240112-125921-marostegui.json
12:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P54708 and previous config saved to /var/cache/conftool/dbconfig/20240112-124416-marostegui.json
12:33 urbanecm: [urbanecm@mwmaint2002 ~]$ mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=dewiki --logwiki=metawiki 'Osip Knecht' 'Artquichotte39'
12:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P54707 and previous config saved to /var/cache/conftool/dbconfig/20240112-122909-marostegui.json
12:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T354336)', diff saved to https://phabricator.wikimedia.org/P54706 and previous config saved to /var/cache/conftool/dbconfig/20240112-121402-marostegui.json
12:11 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2180 (T354336)', diff saved to https://phabricator.wikimedia.org/P54704 and previous config saved to /var/cache/conftool/dbconfig/20240112-121150-marostegui.json
12:11 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2180.codfw.wmnet with reason: Maintenance
12:11 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2180.codfw.wmnet with reason: Maintenance
12:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316 (T354336)', diff saved to https://phabricator.wikimedia.org/P54703 and previous config saved to /var/cache/conftool/dbconfig/20240112-121127-marostegui.json
12:06 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply
12:06 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
12:06 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply
12:05 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
12:03 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply
12:03 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
11:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316', diff saved to https://phabricator.wikimedia.org/P54701 and previous config saved to /var/cache/conftool/dbconfig/20240112-115621-marostegui.json
11:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316', diff saved to https://phabricator.wikimedia.org/P54700 and previous config saved to /var/cache/conftool/dbconfig/20240112-114114-marostegui.json
11:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316 (T354336)', diff saved to https://phabricator.wikimedia.org/P54699 and previous config saved to /var/cache/conftool/dbconfig/20240112-112608-marostegui.json
11:20 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2171:3316 (T354336)', diff saved to https://phabricator.wikimedia.org/P54698 and previous config saved to /var/cache/conftool/dbconfig/20240112-112049-marostegui.json
11:20 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2171.codfw.wmnet with reason: Maintenance
11:20 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2171.codfw.wmnet with reason: Maintenance
11:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169:3316 (T354336)', diff saved to https://phabricator.wikimedia.org/P54697 and previous config saved to /var/cache/conftool/dbconfig/20240112-112027-marostegui.json
11:10 jelto@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab to new version
11:08 isaranto@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
11:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169:3316', diff saved to https://phabricator.wikimedia.org/P54696 and previous config saved to /var/cache/conftool/dbconfig/20240112-110521-marostegui.json
11:04 isaranto@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
10:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169:3316', diff saved to https://phabricator.wikimedia.org/P54695 and previous config saved to /var/cache/conftool/dbconfig/20240112-105014-marostegui.json
10:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169:3316 (T354336)', diff saved to https://phabricator.wikimedia.org/P54694 and previous config saved to /var/cache/conftool/dbconfig/20240112-103508-marostegui.json
10:32 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2169:3316 (T354336)', diff saved to https://phabricator.wikimedia.org/P54693 and previous config saved to /var/cache/conftool/dbconfig/20240112-103250-marostegui.json
10:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2169.codfw.wmnet with reason: Maintenance
10:32 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2169.codfw.wmnet with reason: Maintenance
10:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T354336)', diff saved to https://phabricator.wikimedia.org/P54692 and previous config saved to /var/cache/conftool/dbconfig/20240112-103227-marostegui.json
10:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P54691 and previous config saved to /var/cache/conftool/dbconfig/20240112-101721-marostegui.json
10:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P54690 and previous config saved to /var/cache/conftool/dbconfig/20240112-100214-marostegui.json
09:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T354336)', diff saved to https://phabricator.wikimedia.org/P54689 and previous config saved to /var/cache/conftool/dbconfig/20240112-094708-marostegui.json
09:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2158 (T354336)', diff saved to https://phabricator.wikimedia.org/P54688 and previous config saved to /var/cache/conftool/dbconfig/20240112-094451-marostegui.json
09:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db2187.codfw.wmnet with reason: Maintenance
09:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on db2187.codfw.wmnet with reason: Maintenance
09:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2158.codfw.wmnet with reason: Maintenance
09:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2158.codfw.wmnet with reason: Maintenance
09:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T354336)', diff saved to https://phabricator.wikimedia.org/P54687 and previous config saved to /var/cache/conftool/dbconfig/20240112-094413-marostegui.json
09:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P54686 and previous config saved to /var/cache/conftool/dbconfig/20240112-092907-marostegui.json
09:25 jelto@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab to new version
09:25 jelto@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab Replica to new version
09:17 jelto@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab Replica to new version
09:16 jelto@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab Replica to new version
09:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P54685 and previous config saved to /var/cache/conftool/dbconfig/20240112-091400-marostegui.json
09:09 jelto@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab Replica to new version
08:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T354336)', diff saved to https://phabricator.wikimedia.org/P54684 and previous config saved to /var/cache/conftool/dbconfig/20240112-085854-marostegui.json
08:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2151 (T354336)', diff saved to https://phabricator.wikimedia.org/P54683 and previous config saved to /var/cache/conftool/dbconfig/20240112-085637-marostegui.json
08:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2151.codfw.wmnet with reason: Maintenance
08:56 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2151.codfw.wmnet with reason: Maintenance
08:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2129 (T354336)', diff saved to https://phabricator.wikimedia.org/P54682 and previous config saved to /var/cache/conftool/dbconfig/20240112-085614-marostegui.json
08:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2129', diff saved to https://phabricator.wikimedia.org/P54681 and previous config saved to /var/cache/conftool/dbconfig/20240112-084108-marostegui.json
08:40 godog: upload and finish upgrade of prometheus 2.48 on all sites - T354399
08:38 marostegui@cumin1002: dbctl commit (dc=all): 'db1168 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54680 and previous config saved to /var/cache/conftool/dbconfig/20240112-083837-root.json
08:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2129', diff saved to https://phabricator.wikimedia.org/P54679 and previous config saved to /var/cache/conftool/dbconfig/20240112-082601-marostegui.json
08:23 marostegui@cumin1002: dbctl commit (dc=all): 'db1168 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54678 and previous config saved to /var/cache/conftool/dbconfig/20240112-082332-root.json
08:20 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 3605
08:19 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 3605
08:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2129 (T354336)', diff saved to https://phabricator.wikimedia.org/P54677 and previous config saved to /var/cache/conftool/dbconfig/20240112-081055-marostegui.json
08:08 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2129 (T354336)', diff saved to https://phabricator.wikimedia.org/P54676 and previous config saved to /var/cache/conftool/dbconfig/20240112-080837-marostegui.json
08:08 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2129.codfw.wmnet with reason: Maintenance
08:08 marostegui@cumin1002: dbctl commit (dc=all): 'db1168 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54675 and previous config saved to /var/cache/conftool/dbconfig/20240112-080827-root.json
08:08 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2129.codfw.wmnet with reason: Maintenance
08:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124 (T354336)', diff saved to https://phabricator.wikimedia.org/P54674 and previous config saved to /var/cache/conftool/dbconfig/20240112-080815-marostegui.json
07:53 marostegui@cumin1002: dbctl commit (dc=all): 'db1168 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54673 and previous config saved to /var/cache/conftool/dbconfig/20240112-075322-root.json
07:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124', diff saved to https://phabricator.wikimedia.org/P54672 and previous config saved to /var/cache/conftool/dbconfig/20240112-075309-marostegui.json
07:38 marostegui@cumin1002: dbctl commit (dc=all): 'db1168 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54671 and previous config saved to /var/cache/conftool/dbconfig/20240112-073817-root.json
07:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124', diff saved to https://phabricator.wikimedia.org/P54670 and previous config saved to /var/cache/conftool/dbconfig/20240112-073802-marostegui.json
07:23 marostegui@cumin1002: dbctl commit (dc=all): 'db1168 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54669 and previous config saved to /var/cache/conftool/dbconfig/20240112-072312-root.json
07:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124 (T354336)', diff saved to https://phabricator.wikimedia.org/P54668 and previous config saved to /var/cache/conftool/dbconfig/20240112-072255-marostegui.json
07:20 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2124 (T354336)', diff saved to https://phabricator.wikimedia.org/P54667 and previous config saved to /var/cache/conftool/dbconfig/20240112-072038-marostegui.json
07:20 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2124.codfw.wmnet with reason: Maintenance
07:20 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2124.codfw.wmnet with reason: Maintenance
07:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2117 (T354336)', diff saved to https://phabricator.wikimedia.org/P54666 and previous config saved to /var/cache/conftool/dbconfig/20240112-072015-marostegui.json
07:08 marostegui@cumin1002: dbctl commit (dc=all): 'db1168 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54665 and previous config saved to /var/cache/conftool/dbconfig/20240112-070807-root.json
07:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2117', diff saved to https://phabricator.wikimedia.org/P54664 and previous config saved to /var/cache/conftool/dbconfig/20240112-070508-marostegui.json
06:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1168.eqiad.wmnet with OS bookworm
06:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2117', diff saved to https://phabricator.wikimedia.org/P54663 and previous config saved to /var/cache/conftool/dbconfig/20240112-065002-marostegui.json
06:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1168.eqiad.wmnet with reason: host reimage
06:35 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1168.eqiad.wmnet with reason: host reimage
06:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2117 (T354336)', diff saved to https://phabricator.wikimedia.org/P54662 and previous config saved to /var/cache/conftool/dbconfig/20240112-063456-marostegui.json
06:32 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2117 (T354336)', diff saved to https://phabricator.wikimedia.org/P54661 and previous config saved to /var/cache/conftool/dbconfig/20240112-063239-marostegui.json
06:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2117.codfw.wmnet with reason: Maintenance
06:32 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2117.codfw.wmnet with reason: Maintenance
06:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2097.codfw.wmnet with reason: Maintenance
06:31 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2097.codfw.wmnet with reason: Maintenance
06:23 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1168.eqiad.wmnet with OS bookworm
06:21 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1168 T354506', diff saved to https://phabricator.wikimedia.org/P54660 and previous config saved to /var/cache/conftool/dbconfig/20240112-062137-marostegui.json
06:12 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1173.eqiad.wmnet with reason: Maintenance
06:11 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1173.eqiad.wmnet with reason: Maintenance
04:12 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
04:12 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
04:12 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
04:11 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
04:11 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
04:11 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
00:59 mutante: LDAP - added myself to gerritadmin group

2024-01-11

21:36 jan_drewniak: Running maintenance script to delete unnecessary user preferences (https://phabricator.wikimedia.org/T349337#9454773).
21:26 jdrewniak@deploy2002: Finished scap: Backport for InitialiseSettings.php: disallow obsolete HTML in signatures (enwiki) (T354013), InitialiseSettings.php: Allow thanking bots (T341388) (duration: 13m 43s)
21:20 jdrewniak@deploy2002: jdrewniak and houseblaster: Continuing with sync
21:14 jdrewniak@deploy2002: jdrewniak and houseblaster: Backport for InitialiseSettings.php: disallow obsolete HTML in signatures (enwiki) (T354013), InitialiseSettings.php: Allow thanking bots (T341388) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
21:12 jdrewniak@deploy2002: Started scap: Backport for InitialiseSettings.php: disallow obsolete HTML in signatures (enwiki) (T354013), InitialiseSettings.php: Allow thanking bots (T341388)
20:50 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
20:50 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
20:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231 (T354336)', diff saved to https://phabricator.wikimedia.org/P54657 and previous config saved to /var/cache/conftool/dbconfig/20240111-205021-marostegui.json
20:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P54656 and previous config saved to /var/cache/conftool/dbconfig/20240111-203514-marostegui.json
20:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P54655 and previous config saved to /var/cache/conftool/dbconfig/20240111-202008-marostegui.json
20:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231 (T354336)', diff saved to https://phabricator.wikimedia.org/P54654 and previous config saved to /var/cache/conftool/dbconfig/20240111-200502-marostegui.json
20:03 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1231 (T354336)', diff saved to https://phabricator.wikimedia.org/P54653 and previous config saved to /var/cache/conftool/dbconfig/20240111-200253-marostegui.json
20:03 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1231.eqiad.wmnet with reason: Maintenance
20:02 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1231.eqiad.wmnet with reason: Maintenance
20:02 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1225.eqiad.wmnet with reason: Maintenance
20:02 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1225.eqiad.wmnet with reason: Maintenance
20:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224 (T354336)', diff saved to https://phabricator.wikimedia.org/P54652 and previous config saved to /var/cache/conftool/dbconfig/20240111-200209-marostegui.json
20:00 htriedman@deploy2002: Finished deploy [airflow-dags/platform_eng@07f5320]: (no justification provided) (duration: 00m 27s)
20:00 htriedman@deploy2002: Started deploy [airflow-dags/platform_eng@07f5320]: (no justification provided)
19:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P54651 and previous config saved to /var/cache/conftool/dbconfig/20240111-194703-marostegui.json
19:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P54649 and previous config saved to /var/cache/conftool/dbconfig/20240111-193156-marostegui.json
19:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224 (T354336)', diff saved to https://phabricator.wikimedia.org/P54647 and previous config saved to /var/cache/conftool/dbconfig/20240111-191650-marostegui.json
19:14 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1224 (T354336)', diff saved to https://phabricator.wikimedia.org/P54646 and previous config saved to /var/cache/conftool/dbconfig/20240111-191440-marostegui.json
19:14 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1224.eqiad.wmnet with reason: Maintenance
19:14 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1224.eqiad.wmnet with reason: Maintenance
19:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213:3316 (T354336)', diff saved to https://phabricator.wikimedia.org/P54645 and previous config saved to /var/cache/conftool/dbconfig/20240111-191418-marostegui.json
19:11 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: group2 wikis to 1.42.0-wmf.13 refs T350089
19:06 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
19:05 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
18:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213:3316', diff saved to https://phabricator.wikimedia.org/P54644 and previous config saved to /var/cache/conftool/dbconfig/20240111-185912-marostegui.json
18:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213:3316', diff saved to https://phabricator.wikimedia.org/P54643 and previous config saved to /var/cache/conftool/dbconfig/20240111-184405-marostegui.json
18:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213:3316 (T354336)', diff saved to https://phabricator.wikimedia.org/P54641 and previous config saved to /var/cache/conftool/dbconfig/20240111-182859-marostegui.json
18:27 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1213:3316 (T354336)', diff saved to https://phabricator.wikimedia.org/P54640 and previous config saved to /var/cache/conftool/dbconfig/20240111-182745-marostegui.json
18:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1213.eqiad.wmnet with reason: Maintenance
18:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1213.eqiad.wmnet with reason: Maintenance
18:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201 (T354336)', diff saved to https://phabricator.wikimedia.org/P54639 and previous config saved to /var/cache/conftool/dbconfig/20240111-182723-marostegui.json
18:27 thcipriani@deploy2002: Finished deploy [gerrit/gerrit@376b3e5]: Remove devsat survey banner in 3.6 (gerrit primary: gerrit.wikimedia.org) (duration: 00m 07s)
18:27 thcipriani@deploy2002: Started deploy [gerrit/gerrit@376b3e5]: Remove devsat survey banner in 3.6 (gerrit primary: gerrit.wikimedia.org)
18:25 thcipriani@deploy2002: Finished deploy [gerrit/gerrit@376b3e5]: Remove devsat survey banner in 3.6 (gerrit2002 only) (duration: 00m 05s)
18:25 thcipriani@deploy2002: Started deploy [gerrit/gerrit@376b3e5]: Remove devsat survey banner in 3.6 (gerrit2002 only)
18:23 thcipriani: deploying gerrit to remove devsat survey (no restart needed)
18:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201', diff saved to https://phabricator.wikimedia.org/P54638 and previous config saved to /var/cache/conftool/dbconfig/20240111-181217-marostegui.json
17:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201', diff saved to https://phabricator.wikimedia.org/P54637 and previous config saved to /var/cache/conftool/dbconfig/20240111-175710-marostegui.json
17:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201 (T354336)', diff saved to https://phabricator.wikimedia.org/P54636 and previous config saved to /var/cache/conftool/dbconfig/20240111-174204-marostegui.json
17:39 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1201 (T354336)', diff saved to https://phabricator.wikimedia.org/P54635 and previous config saved to /var/cache/conftool/dbconfig/20240111-173955-marostegui.json
17:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1201.eqiad.wmnet with reason: Maintenance
17:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1201.eqiad.wmnet with reason: Maintenance
17:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187 (T354336)', diff saved to https://phabricator.wikimedia.org/P54634 and previous config saved to /var/cache/conftool/dbconfig/20240111-173933-marostegui.json
17:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187', diff saved to https://phabricator.wikimedia.org/P54633 and previous config saved to /var/cache/conftool/dbconfig/20240111-172427-marostegui.json
17:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187', diff saved to https://phabricator.wikimedia.org/P54632 and previous config saved to /var/cache/conftool/dbconfig/20240111-170920-marostegui.json
16:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187 (T354336)', diff saved to https://phabricator.wikimedia.org/P54631 and previous config saved to /var/cache/conftool/dbconfig/20240111-165414-marostegui.json
16:53 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1187 (T354336)', diff saved to https://phabricator.wikimedia.org/P54630 and previous config saved to /var/cache/conftool/dbconfig/20240111-165305-marostegui.json
16:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1187.eqiad.wmnet with reason: Maintenance
16:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1187.eqiad.wmnet with reason: Maintenance
16:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T354336)', diff saved to https://phabricator.wikimedia.org/P54629 and previous config saved to /var/cache/conftool/dbconfig/20240111-165244-marostegui.json
16:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P54628 and previous config saved to /var/cache/conftool/dbconfig/20240111-163738-marostegui.json
16:23 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
16:23 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
16:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P54626 and previous config saved to /var/cache/conftool/dbconfig/20240111-162231-marostegui.json
16:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T354336)', diff saved to https://phabricator.wikimedia.org/P54625 and previous config saved to /var/cache/conftool/dbconfig/20240111-160725-marostegui.json
16:07 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: cache::upload
16:05 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1180 (T354336)', diff saved to https://phabricator.wikimedia.org/P54624 and previous config saved to /var/cache/conftool/dbconfig/20240111-160516-marostegui.json
16:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1180.eqiad.wmnet with reason: Maintenance
16:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1180.eqiad.wmnet with reason: Maintenance
16:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T354336)', diff saved to https://phabricator.wikimedia.org/P54623 and previous config saved to /var/cache/conftool/dbconfig/20240111-160454-marostegui.json
15:59 sukhe: restart pybal on lvs4010
15:58 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on A:swift-fe
15:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P54622 and previous config saved to /var/cache/conftool/dbconfig/20240111-154947-marostegui.json
15:47 mvernon@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on A:swift-fe
15:41 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: cache::upload
15:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P54621 and previous config saved to /var/cache/conftool/dbconfig/20240111-153441-marostegui.json
15:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T354336)', diff saved to https://phabricator.wikimedia.org/P54620 and previous config saved to /var/cache/conftool/dbconfig/20240111-151934-marostegui.json
15:17 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1168 (T354336)', diff saved to https://phabricator.wikimedia.org/P54619 and previous config saved to /var/cache/conftool/dbconfig/20240111-151724-marostegui.json
15:17 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1168.eqiad.wmnet with reason: Maintenance
15:17 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1168.eqiad.wmnet with reason: Maintenance
15:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T354336)', diff saved to https://phabricator.wikimedia.org/P54618 and previous config saved to /var/cache/conftool/dbconfig/20240111-151702-marostegui.json
15:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P54617 and previous config saved to /var/cache/conftool/dbconfig/20240111-150156-marostegui.json
14:51 reedy@deploy2002: Synchronized wmf-config/: T325147 (duration: 06m 43s)
14:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P54616 and previous config saved to /var/cache/conftool/dbconfig/20240111-144649-marostegui.json
14:36 reedy@deploy2002: Synchronized wmf-config/: T344398 (duration: 07m 25s)
14:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T354336)', diff saved to https://phabricator.wikimedia.org/P54615 and previous config saved to /var/cache/conftool/dbconfig/20240111-143143-marostegui.json
14:30 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1165 (T354336)', diff saved to https://phabricator.wikimedia.org/P54614 and previous config saved to /var/cache/conftool/dbconfig/20240111-143034-marostegui.json
14:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
14:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
14:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1165.eqiad.wmnet with reason: Maintenance
14:29 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1165.eqiad.wmnet with reason: Maintenance
14:26 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
14:25 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
14:25 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
14:25 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
14:24 kamila@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
14:24 kamila@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
14:21 reedy@deploy2002: Synchronized wmf-config/InitialiseSettings.php: T205347 (duration: 07m 41s)
14:10 marostegui@cumin1002: dbctl commit (dc=all): 'db2124 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54613 and previous config saved to /var/cache/conftool/dbconfig/20240111-141058-root.json
13:55 marostegui@cumin1002: dbctl commit (dc=all): 'db2124 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54612 and previous config saved to /var/cache/conftool/dbconfig/20240111-135553-root.json
13:49 hashar@deploy2002: Finished deploy [gerrit/gerrit@af34477]: wm-zuul-status: add SCHEDULED for pending check run - T348959 (duration: 00m 07s)
13:49 hashar@deploy2002: Started deploy [gerrit/gerrit@af34477]: wm-zuul-status: add SCHEDULED for pending check run - T348959
13:41 moritzm: installing xerces-c security updates
13:40 marostegui@cumin1002: dbctl commit (dc=all): 'db2124 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54611 and previous config saved to /var/cache/conftool/dbconfig/20240111-134048-root.json
13:29 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
13:29 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
13:25 marostegui@cumin1002: dbctl commit (dc=all): 'db2124 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54610 and previous config saved to /var/cache/conftool/dbconfig/20240111-132543-root.json
13:10 marostegui@cumin1002: dbctl commit (dc=all): 'db2124 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54609 and previous config saved to /var/cache/conftool/dbconfig/20240111-131038-root.json
12:55 marostegui@cumin1002: dbctl commit (dc=all): 'db2124 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54608 and previous config saved to /var/cache/conftool/dbconfig/20240111-125533-root.json
12:47 hashar: Restarting Gerrit to apply config change https://gerrit.wikimedia.org/r/c/operations/puppet/+/989735/ # T206049
12:40 marostegui@cumin1002: dbctl commit (dc=all): 'db2124 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54607 and previous config saved to /var/cache/conftool/dbconfig/20240111-124028-root.json
12:33 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2124.codfw.wmnet with OS bookworm
12:20 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
12:20 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
12:11 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2124.codfw.wmnet with reason: host reimage
12:08 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2124.codfw.wmnet with reason: host reimage
12:00 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
12:00 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
11:59 moritzm: installing Python 2.7 security updates on Bullseye
11:50 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2124.codfw.wmnet with OS bookworm
11:49 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2124 T354506', diff saved to https://phabricator.wikimedia.org/P54606 and previous config saved to /var/cache/conftool/dbconfig/20240111-114930-marostegui.json
11:19 marostegui@cumin1002: dbctl commit (dc=all): 'db1201 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54605 and previous config saved to /var/cache/conftool/dbconfig/20240111-111958-root.json
11:04 marostegui@cumin1002: dbctl commit (dc=all): 'db1201 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54604 and previous config saved to /var/cache/conftool/dbconfig/20240111-110453-root.json
10:54 moritzm: installing Linux 5.10.205 updates on Bullseye hosts
10:49 marostegui@cumin1002: dbctl commit (dc=all): 'db1201 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54603 and previous config saved to /var/cache/conftool/dbconfig/20240111-104948-root.json
10:34 marostegui@cumin1002: dbctl commit (dc=all): 'db1201 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54602 and previous config saved to /var/cache/conftool/dbconfig/20240111-103443-root.json
10:31 moritzm: installing exim4 security updates
10:31 kharlan@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
10:30 kharlan@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
10:28 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: druid::public::worker
10:26 kharlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
10:26 kharlan@deploy2002: helmfile [codfw] START helmfile.d/services/ipoid: apply
10:19 marostegui@cumin1002: dbctl commit (dc=all): 'db1201 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54601 and previous config saved to /var/cache/conftool/dbconfig/20240111-101938-root.json
10:13 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: druid::public::worker
10:12 kharlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
10:12 kharlan@deploy2002: helmfile [codfw] START helmfile.d/services/ipoid: apply
10:04 marostegui@cumin1002: dbctl commit (dc=all): 'db1201 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54600 and previous config saved to /var/cache/conftool/dbconfig/20240111-100433-root.json
10:04 sfaci@deploy2002: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
10:03 sfaci@deploy2002: helmfile [staging] START helmfile.d/services/edit-analytics: apply
10:03 kharlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
10:00 kharlan@deploy2002: helmfile [eqiad] START helmfile.d/services/ipoid: apply
10:00 kharlan@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
09:58 kharlan@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
09:53 kharlan@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
09:49 marostegui@cumin1002: dbctl commit (dc=all): 'db1201 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54599 and previous config saved to /var/cache/conftool/dbconfig/20240111-094928-root.json
09:39 hashar: Gerrit back up and operational, now running version 3.6.8
09:33 hashar: Gerrit restarted and it's reindexing all changes T309870
09:23 hashar@deploy2002: Finished deploy [gerrit/gerrit@e099b0b]: Gerrit to version 3.6.8 # T309870 (duration: 00m 07s)
09:23 hashar@deploy2002: Started deploy [gerrit/gerrit@e099b0b]: Gerrit to version 3.6.8 # T309870
09:22 hashar@deploy2002: Finished deploy [gerrit/gerrit@e099b0b]: Gerrit to version 3.6.8 # T309870 (duration: 00m 27s)
09:21 hashar@deploy2002: Started deploy [gerrit/gerrit@e099b0b]: Gerrit to version 3.6.8 # T309870
09:21 hashar: Stopping Gerrit
09:10 hashar: gerrit: `ssh -p 29418 gerrit.wikimedia.org gerrit copy-approvals` # T309870
09:06 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1201.eqiad.wmnet with OS bookworm
08:45 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1201.eqiad.wmnet with reason: host reimage
08:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1201.eqiad.wmnet with reason: host reimage

2024-01-10

22:29 herron@cumin1002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-logging-eqiad
22:05 herron@cumin1002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-logging-eqiad
21:54 herron@cumin1002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-logging-codfw
21:36 Dreamy_Jazz: UTC late deploys done
21:33 dreamyjazz@deploy2002: Finished scap: Backport for Add comment to clarify which rate limits apply to temporary users (T331576) (duration: 08m 05s)
21:28 herron@cumin1002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-logging-codfw
21:27 dreamyjazz@deploy2002: dreamyjazz and tchanders: Continuing with sync
21:27 dreamyjazz@deploy2002: dreamyjazz and tchanders: Backport for Add comment to clarify which rate limits apply to temporary users (T331576) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
21:25 dreamyjazz@deploy2002: Started scap: Backport for Add comment to clarify which rate limits apply to temporary users (T331576)
21:19 taavi@deploy2002: Finished scap: Backport for Disable max width for index namespace (T352162) (duration: 14m 19s)
21:12 taavi@deploy2002: toyofuku and taavi: Continuing with sync
21:08 taavi@deploy2002: toyofuku and taavi: Backport for Disable max width for index namespace (T352162) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
21:05 taavi@deploy2002: Started scap: Backport for Disable max width for index namespace (T352162)
20:22 sukhe: enable puppet on lvs2013: T352758
19:29 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
19:29 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove old records for mr1-codfw core links - cmooney@cumin1002"
19:28 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove old records for mr1-codfw core links - cmooney@cumin1002"
19:26 jhuneidi@deploy2002: Synchronized php: group1 wikis to 1.42.0-wmf.13 refs T350089 (duration: 07m 58s)
19:24 cmooney@cumin1002: START - Cookbook sre.dns.netbox
19:18 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: group1 wikis to 1.42.0-wmf.13 refs T350089
19:00 topranks: disabling OSPF connection from mr1-codfw to codfw core routers T348164
18:40 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply
18:38 filippo@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on prometheus2006.codfw.wmnet with reason: memory upgrade
18:37 filippo@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on prometheus2006.codfw.wmnet with reason: memory upgrade
18:37 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
18:37 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply
18:36 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
18:35 filippo@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for prometheus2005.codfw.wmnet
18:35 filippo@cumin1002: START - Cookbook sre.hosts.remove-downtime for prometheus2005.codfw.wmnet
18:24 sukhe: stop pybal on lvs2013: T352758
17:59 filippo@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on prometheus2005.codfw.wmnet with reason: memory upgrade
17:58 filippo@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on prometheus2005.codfw.wmnet with reason: memory upgrade
17:54 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1377.eqiad.wmnet with OS bullseye
17:47 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
17:46 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
17:44 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
17:44 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
17:40 sukhe@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host lvs2014.codfw.wmnet
17:34 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1377.eqiad.wmnet with reason: host reimage
17:31 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1377.eqiad.wmnet with reason: host reimage
17:28 sukhe@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2014.codfw.wmnet
17:27 sukhe: enable puppet on lvs2014: T352758
17:16 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1377.eqiad.wmnet with OS bullseye
17:15 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1378.eqiad.wmnet with OS bullseye
17:14 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
17:14 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update reverse dns for sandbox1-a-codfw irb.2201 gw - cmooney@cumin1002"
17:14 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update reverse dns for sandbox1-a-codfw irb.2201 gw - cmooney@cumin1002"
17:09 cmooney@cumin1002: START - Cookbook sre.dns.netbox
16:55 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1378.eqiad.wmnet with reason: host reimage
16:52 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1378.eqiad.wmnet with reason: host reimage
16:37 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1378.eqiad.wmnet with OS bullseye
16:36 godog: upgrade prometheus on prometheus2006 - T354399
16:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply
16:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
16:31 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply
16:31 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
16:30 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply
16:29 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
16:25 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mw[1379-1383].eqiad.wmnet with reason: testing reboot
16:25 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on mw[1379-1383].eqiad.wmnet with reason: testing reboot
16:22 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1379.eqiad.wmnet with OS bullseye
16:20 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
16:02 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1379.eqiad.wmnet with reason: host reimage
16:00 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1383.eqiad.wmnet with OS bullseye
15:59 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1381.eqiad.wmnet with OS bullseye
15:57 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1382.eqiad.wmnet with OS bullseye
15:57 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1379.eqiad.wmnet with reason: host reimage
15:41 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: logging::opensearch::data
15:41 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1379.eqiad.wmnet with OS bullseye
15:40 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1383.eqiad.wmnet with reason: host reimage
15:37 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1381.eqiad.wmnet with reason: host reimage
15:37 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw1382.eqiad.wmnet with reason: host reimage
15:35 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1383.eqiad.wmnet with reason: host reimage
15:35 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1382.eqiad.wmnet with reason: host reimage
15:34 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1381.eqiad.wmnet with reason: host reimage
15:24 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: logging::opensearch::data
15:24 klausman@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-staging2001.codfw.wmnet
15:22 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase2013.codfw.wmnet with reason: Decommissioning — T352469
15:21 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase2013.codfw.wmnet with reason: Decommissioning — T352469
15:21 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1383.eqiad.wmnet with OS bullseye
15:20 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1382.eqiad.wmnet with OS bullseye
15:20 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: logging::opensearch::collector
15:19 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1381.eqiad.wmnet with OS bullseye
15:17 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1380.eqiad.wmnet with OS bullseye
15:14 klausman@cumin1001: START - Cookbook sre.hosts.reboot-single for host ml-staging2001.codfw.wmnet
15:13 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on an-master[1003-1004].eqiad.wmnet with reason: Bringing new nameservers into service
15:13 klausman@cumin1001: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ml-staging2001.codfw.wmnet
15:12 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on an-master[1003-1004].eqiad.wmnet with reason: Bringing new nameservers into service
15:07 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts dbproxy[1018-1019].eqiad.wmnet
15:06 taavi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
15:06 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: dbproxy[1018-1019].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - taavi@cumin1002"
15:04 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on lvs2014.codfw.wmnet with reason: T352758
15:04 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 3:00:00 on lvs2014.codfw.wmnet with reason: T352758
15:03 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: dbproxy[1018-1019].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - taavi@cumin1002"
15:01 klausman@cumin1001: START - Cookbook sre.hosts.reboot-single for host ml-staging2001.codfw.wmnet
15:01 sukhe: disable puppet and stop pybal on lvs2014: T352758
15:00 taavi@cumin1002: START - Cookbook sre.dns.netbox
14:57 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1380.eqiad.wmnet with reason: host reimage
14:55 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: logging::opensearch::collector
14:54 topranks: adding vlans to ssw1-a8-codfw to trunk to lvs2014 T352758
14:52 taavi@cumin1002: START - Cookbook sre.hosts.decommission for hosts dbproxy[1018-1019].eqiad.wmnet
14:52 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1380.eqiad.wmnet with reason: host reimage
14:44 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: lvs::balancer
14:39 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mw1349.eqiad.wmnet with reason: Trying to reproduce wdat_wdt watchdog problem
14:39 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on mw1349.eqiad.wmnet with reason: Trying to reproduce wdat_wdt watchdog problem
14:38 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1380.eqiad.wmnet with OS bullseye
14:27 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: lvs::balancer
14:27 kharlan@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
14:27 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mw1378.eqiad.wmnet with reason: Trying to reproduce wdat_wdt watchdog problem
14:26 kharlan@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
14:26 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on mw1378.eqiad.wmnet with reason: Trying to reproduce wdat_wdt watchdog problem
14:25 kharlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
14:24 kharlan@deploy2002: helmfile [eqiad] START helmfile.d/services/ipoid: apply
14:22 kharlan@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
14:21 moritzm: installing lapack bugfix updates
14:21 kharlan@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
14:04 moritzm: installing openblas bugfix updates
14:03 hashar: Switching operations-puppet-tests-buster-docker Jenkins job from tox v3 to tox v4 | T345152
13:56 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
13:56 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
13:54 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
13:54 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
13:15 godog: test prometheus 2.48.1 on prometheus1005 - T354399
12:48 stevemunene@cumin1002: END (FAIL) - Cookbook sre.hadoop.roll-restart-workers (exit_code=99) restart workers for Hadoop test cluster: Roll restart of jvm daemons for openjdk upgrade.
12:47 stevemunene@cumin1002: START - Cookbook sre.hadoop.roll-restart-workers restart workers for Hadoop test cluster: Roll restart of jvm daemons for openjdk upgrade.
12:39 stevemunene@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts druid1006.eqiad.wmnet
12:39 stevemunene@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
12:39 stevemunene@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: druid1006.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - stevemunene@cumin1002"
12:37 stevemunene@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: druid1006.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - stevemunene@cumin1002"
12:37 hnowlan@deploy2002: helmfile [codfw] [main] DONE helmfile.d/services/mw-jobrunner : sync
12:37 hnowlan@deploy2002: helmfile [codfw] [main] START helmfile.d/services/mw-jobrunner : sync
12:37 hnowlan@deploy2002: helmfile [eqiad] [main] DONE helmfile.d/services/mw-jobrunner : sync
12:37 hnowlan@deploy2002: helmfile [eqiad] [main] START helmfile.d/services/mw-jobrunner : sync
12:35 stevemunene@cumin1002: START - Cookbook sre.dns.netbox
12:22 stevemunene@cumin1002: START - Cookbook sre.hosts.decommission for hosts druid1006.eqiad.wmnet
12:21 stevemunene@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts druid1005.eqiad.wmnet
12:21 stevemunene@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
12:21 stevemunene@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: druid1005.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - stevemunene@cumin1002"
12:20 stevemunene@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: druid1005.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - stevemunene@cumin1002"
12:18 stevemunene@cumin1002: START - Cookbook sre.dns.netbox
12:05 stevemunene@cumin1002: START - Cookbook sre.hosts.decommission for hosts druid1005.eqiad.wmnet
11:56 stevemunene@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts druid1004.eqiad.wmnet
11:56 stevemunene@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
11:56 stevemunene@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: druid1004.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - stevemunene@cumin1002"
11:54 stevemunene@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: druid1004.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - stevemunene@cumin1002"
11:51 stevemunene@cumin1002: START - Cookbook sre.dns.netbox
11:47 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
11:46 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
11:46 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
11:46 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
11:46 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
11:46 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
11:43 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
11:43 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
11:43 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
11:41 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
11:41 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
11:41 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
11:39 stevemunene@cumin1002: START - Cookbook sre.hosts.decommission for hosts druid1004.eqiad.wmnet
11:37 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
11:37 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
11:36 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
11:36 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
11:03 moritzm: installing PHP 7.3 security updates
10:46 moritzm: installing curl security updates
10:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testreduce1001.eqiad.wmnet
10:04 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
10:04 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: testreduce1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
10:02 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: testreduce1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
10:01 jgiannelos@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifeeds: sync
10:00 jgiannelos@deploy2002: helmfile [codfw] START helmfile.d/services/wikifeeds: sync
10:00 jgiannelos@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifeeds: sync
10:00 jgiannelos@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifeeds: sync
09:57 jmm@cumin2002: START - Cookbook sre.dns.netbox
09:55 hashar@deploy2002: Finished deploy [integration/docroot@355ddbb]: (no justification provided) (duration: 00m 04s)
09:55 hashar@deploy2002: Started deploy [integration/docroot@355ddbb]: (no justification provided)
09:55 moritzm: installing git security updates on deployment hosts
09:53 hashar@deploy2002: Finished deploy [integration/docroot@355ddbb]: Dummy deploy to test git safe.directory # T335354 (duration: 00m 06s)
09:53 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testreduce1001.eqiad.wmnet
09:53 hashar@deploy2002: Started deploy [integration/docroot@355ddbb]: Dummy deploy to test git safe.directory # T335354
09:38 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mw1349.eqiad.wmnet with reason: Trying to reproduce wdat_wdt watchdog problem
09:38 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on mw1349.eqiad.wmnet with reason: Trying to reproduce wdat_wdt watchdog problem
09:38 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mw1378.eqiad.wmnet with reason: Trying to reproduce wdat_wdt watchdog problem
09:38 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on mw1378.eqiad.wmnet with reason: Trying to reproduce wdat_wdt watchdog problem
09:01 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 15133
09:00 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 15133
08:59 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 13150
08:57 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 13150
08:47 dcausse@deploy2002: Finished scap: Backport for enable page_rerender for 4th batch of wikis (T351503) (duration: 11m 50s)
08:42 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mw1349.eqiad.wmnet with reason: Trying to reproduce wdat_wdt watchdog problem
08:41 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on mw1349.eqiad.wmnet with reason: Trying to reproduce wdat_wdt watchdog problem
08:41 moritzm: installing Exim security updates
08:40 dcausse@deploy2002: pfischer and dcausse: Continuing with sync
08:37 dcausse@deploy2002: pfischer and dcausse: Backport for enable page_rerender for 4th batch of wikis (T351503) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
08:35 dcausse@deploy2002: Started scap: Backport for enable page_rerender for 4th batch of wikis (T351503)
08:12 kartik@deploy2002: Finished scap: Backport for testwiki: Enable Section translation on WPs with Content Translation available as default (T351882) (duration: 09m 10s)
08:06 kartik@deploy2002: kartik: Continuing with sync
08:04 kartik@deploy2002: kartik: Backport for testwiki: Enable Section translation on WPs with Content Translation available as default (T351882) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
08:03 kartik@deploy2002: Started scap: Backport for testwiki: Enable Section translation on WPs with Content Translation available as default (T351882)
07:53 moritzm: installing openjdk-8 security updates
07:15 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2143.codfw.wmnet with OS bookworm
06:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2143.codfw.wmnet with reason: host reimage
06:50 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2143.codfw.wmnet with reason: host reimage
06:32 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2143.codfw.wmnet with OS bookworm

2024-01-09

21:23 aqu@deploy2002: Finished deploy [airflow-dags/analytics@ea53374]: Regular airflow-dags/analytics weekly train [airflow-dags@ea53374f] (duration: 00m 28s)
21:22 aqu@deploy2002: Started deploy [airflow-dags/analytics@ea53374]: Regular airflow-dags/analytics weekly train [airflow-dags@ea53374f]
21:21 aqu@deploy2002: Finished deploy [airflow-dags/analytics_test@ea53374]: Regular airflow-dags/analytics_test weekly train [airflow-dags@ea53374f] (duration: 00m 12s)
21:21 aqu@deploy2002: Started deploy [airflow-dags/analytics_test@ea53374]: Regular airflow-dags/analytics_test weekly train [airflow-dags@ea53374f]
21:03 aqu@deploy2002: Finished deploy [analytics/refinery@c4fed56] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@c4fed56c] (test number 2 after permission error) (duration: 00m 05s)
21:03 aqu@deploy2002: Started deploy [analytics/refinery@c4fed56] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@c4fed56c] (test number 2 after permission error)
21:02 aqu@deploy2002: Finished deploy [analytics/refinery@c4fed56] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@c4fed56c] (duration: 03m 33s)
20:59 aqu@deploy2002: Started deploy [analytics/refinery@c4fed56] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@c4fed56c]
20:59 aqu@deploy2002: Finished deploy [analytics/refinery@c4fed56] (thin): Regular analytics weekly train THIN [analytics/refinery@c4fed56c] (duration: 00m 06s)
20:58 aqu@deploy2002: Started deploy [analytics/refinery@c4fed56] (thin): Regular analytics weekly train THIN [analytics/refinery@c4fed56c]
20:58 aqu@deploy2002: Finished deploy [analytics/refinery@c4fed56]: Regular analytics weekly train [analytics/refinery@c4fed56c] (duration: 09m 06s)
20:49 eevans@cumin1002: conftool action : set/weight=0; selector: cluster=restbase,dc=codfw,name=restbase2019.codfw.wmnet
20:49 eevans@cumin1002: conftool action : set/weight=0; selector: cluster=restbase,dc=codfw,name=restbase2014.codfw.wmnet
20:49 eevans@cumin1002: conftool action : set/weight=0; selector: cluster=restbase,dc=codfw,name=restbase2013.codfw.wmnet
20:49 aqu@deploy2002: Started deploy [analytics/refinery@c4fed56]: Regular analytics weekly train [analytics/refinery@c4fed56c]
20:48 aqu: about to deploy analytics/refinery - weekly train
20:40 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: group0 wikis to 1.42.0-wmf.13 refs T350089
20:26 jhuneidi@deploy2002: Finished scap: testwikis wikis to 1.42.0-wmf.13 refs T350089 (duration: 23m 33s)
20:03 jhuneidi@deploy2002: Started scap: testwikis wikis to 1.42.0-wmf.13 refs T350089
19:44 mutante: mwmaint1002 - rm -rf 1.42.0-wmf.7 ; mwmaint2002 - rm -rf php-1.39.0-wmf.25
19:35 mutante: mwmaint1002 - rm -rf /srv/mediawiki/php-1.40.0-wmf.17
19:33 mutante: mwmaint1002 - rm -rf /srv/mediawiki/php-1.39.0-wmf.25 after monitoring alerted about 99% disk usage on /srv
19:26 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: all wikis to 1.42.0-wmf.12 refs T350089
19:16 urandom: decommissioning cassandra, restbase2013-{a,b,c} — T352469
19:14 jhuneidi@deploy2002: Finished scap: testwikis wikis to 1.42.0-wmf.13 refs T350089 (duration: 45m 48s)
18:42 cmooney@cumin1002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin[1001-1002].eqiad.wmnet with reason: Release v0.6.5 - cmooney@cumin1002
18:40 cmooney@cumin1002: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin[1001-1002].eqiad.wmnet with reason: Release v0.6.5 - cmooney@cumin1002
18:29 jhuneidi@deploy2002: Started scap: testwikis wikis to 1.42.0-wmf.13 refs T350089
18:04 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
18:04 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add new reverse entries for mr1 -> lsw1-a2 link in codfw - cmooney@cumin1002"
18:02 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add new reverse entries for mr1 -> lsw1-a2 link in codfw - cmooney@cumin1002"
18:00 cmooney@cumin1002: START - Cookbook sre.dns.netbox
17:41 jhancock@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['db2143']
17:33 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2143']
17:31 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['db2143']
17:21 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2143']
17:17 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti-test2004.codfw.wmnet
17:17 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
17:17 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti-test2004.codfw.wmnet decommissioned, removing all IPs except the asset tag one - ayounsi@cumin1002"
17:14 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti-test2004.codfw.wmnet decommissioned, removing all IPs except the asset tag one - ayounsi@cumin1002"
17:12 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
17:06 ayounsi@cumin1002: START - Cookbook sre.hosts.decommission for hosts ganeti-test2004.codfw.wmnet
17:05 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti-test[1001-1002].eqiad.wmnet
17:05 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
17:05 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti-test[1001-1002].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - ayounsi@cumin1002"
17:04 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti-test[1001-1002].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - ayounsi@cumin1002"
17:02 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
16:53 ayounsi@cumin1002: START - Cookbook sre.hosts.decommission for hosts ganeti-test[1001-1002].eqiad.wmnet
16:27 jayme: restart prometheus@k8s on prometheus1005 revert GOGC to 100 (default) - T354604
16:22 mutante: phabricator - differential has been disabled (T330797)
16:11 brennen@deploy2002: Finished deploy [phabricator/deployment@369e797]: deploy to phab1004 for T354545 (duration: 00m 56s)
16:10 brennen@deploy2002: Started deploy [phabricator/deployment@369e797]: deploy to phab1004 for T354545
16:10 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudrabbit1003.wikimedia.org
16:10 taavi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
16:10 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudrabbit1003.wikimedia.org decommissioned, removing all IPs except the asset tag one - taavi@cumin1002"
16:09 brennen@deploy2002: Finished deploy [phabricator/deployment@369e797]: deploy to phab2002 for T354545 (duration: 00m 55s)
16:09 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudrabbit1003.wikimedia.org decommissioned, removing all IPs except the asset tag one - taavi@cumin1002"
16:09 mutante: phabricator deployment in progress
16:08 brennen@deploy2002: Started deploy [phabricator/deployment@369e797]: deploy to phab2002 for T354545
16:08 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on phab2002.codfw.wmnet with reason: deployment
16:08 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on phab2002.codfw.wmnet with reason: deployment
16:07 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on phab1004.eqiad.wmnet with reason: deployment
16:04 taavi@cumin1002: START - Cookbook sre.dns.netbox
15:58 taavi@cumin1002: START - Cookbook sre.hosts.decommission for hosts cloudrabbit1003.wikimedia.org
15:54 jayme: restart prometheus@k8s on prometheus1005 with GOGC=60 - T354604
15:37 akosiaris: depool and reboot mw1349 for a test T354413
15:36 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_eqiad and A:cp
15:19 sukhe: restart pybal on lvs1019: T336043
15:19 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_eqiad and A:cp
15:16 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
15:16 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
15:16 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
15:15 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
15:14 sukhe: restart pybal on lvs1020: T336043
15:06 TheresNoTime: done UTC afternoon backport window
15:03 jgiannelos@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifeeds: apply
15:02 jgiannelos@deploy2002: helmfile [codfw] START helmfile.d/services/wikifeeds: apply
15:02 jgiannelos@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifeeds: apply
15:01 TheresNoTime: `[samtar@mwmaint2002 ~]$ echo 'https://en.wikipedia.org/static/images/mobile/copyright/wikinews-wordmark-zh.svg' | mwscript purgeList.php` T353792
15:01 jgiannelos@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifeeds: apply
15:00 TheresNoTime: `[samtar@mwmaint2002 ~]$ mwscript namespaceDupes.php --wiki bjnwikiquote --add-prefix "BROKEN " --fix` T350235
14:59 TheresNoTime: `[samtar@mwmaint2002 ~]$ mwscript namespaceDupes.php --wiki zghwiki --add-prefix "BROKEN " --fix` T350241
14:58 samtar@deploy2002: Finished scap: Backport for zghwiki: add metanamespace (T350241), bjnwikiquote: add metanamespace (T350235) (duration: 12m 10s)
14:56 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
14:56 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/wikifeeds: apply
14:52 samtar@deploy2002: samtar and anzx: Continuing with sync
14:50 samtar@deploy2002: samtar and anzx: Backport for zghwiki: add metanamespace (T350241), bjnwikiquote: add metanamespace (T350235) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:46 samtar@deploy2002: Started scap: Backport for zghwiki: add metanamespace (T350241), bjnwikiquote: add metanamespace (T350235)
14:45 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2034.codfw.wmnet with OS bookworm
14:44 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - ayounsi@cumin1002"
14:43 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_eqiad and A:cp
14:42 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - ayounsi@cumin1002"
14:38 TheresNoTime: `[samtar@mwmaint2002 ~]$ mwscript namespaceDupes.php --wiki hewikinews --fix` T349581
14:38 samtar@deploy2002: Finished scap: Backport for Create draft namespace and add namespaces aliases for hewikinews (T349581) (duration: 10m 05s)
14:36 kevinbazira@deploy2002: helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
14:35 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
14:34 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host snapshot1014.eqiad.wmnet
14:32 samtar@deploy2002: samtar and anzx: Continuing with sync
14:30 samtar@deploy2002: samtar and anzx: Backport for Create draft namespace and add namespaces aliases for hewikinews (T349581) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:28 samtar@deploy2002: Started scap: Backport for Create draft namespace and add namespaces aliases for hewikinews (T349581)
14:27 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_eqiad and A:cp
14:26 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.restart (exit_code=99)
14:26 bking@cumin2002: START - Cookbook sre.wdqs.restart
14:26 TheresNoTime: deployed patch for T350739, logging bot not working?
14:24 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti2034.codfw.wmnet with reason: host reimage
14:23 samtar@deploy2002: Finished scap: Backport for [namespaces] Use correct diacritics in Romanian (duration: 14m 42s)
14:22 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_esams and not P{cp3066.esams.wmnet} and A:cp
14:21 ayounsi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti2034.codfw.wmnet with reason: host reimage
14:16 samtar@deploy2002: strainu and samtar: Continuing with sync
14:13 eevans@cumin1002: conftool action : set/weight=10; selector: cluster=restbase,dc=codfw,name=restbase2035.codfw.wmnet
14:12 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for restbase2035.codfw.wmnet
14:12 eevans@cumin1002: START - Cookbook sre.hosts.remove-downtime for restbase2035.codfw.wmnet
14:09 samtar@deploy2002: strainu and samtar: Backport for [namespaces] Use correct diacritics in Romanian synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:08 samtar@deploy2002: Started scap: Backport for [namespaces] Use correct diacritics in Romanian
14:04 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_esams and not P{cp3066.esams.wmnet} and A:cp
14:01 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host ganeti2034.codfw.wmnet with OS bookworm
14:01 ayounsi@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host ganeti2034.codfw.wmnet with OS bookworm
13:58 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2033.codfw.wmnet with OS bookworm
13:58 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - ayounsi@cumin1002"
13:56 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - ayounsi@cumin1002"
13:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host snapshot1014.eqiad.wmnet
13:43 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host snapshot1014.eqiad.wmnet with OS bullseye
13:41 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host ganeti2034.codfw.wmnet with OS bookworm
13:37 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti2033.codfw.wmnet with reason: host reimage
13:34 ayounsi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti2033.codfw.wmnet with reason: host reimage
13:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1180 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54575 and previous config saved to /var/cache/conftool/dbconfig/20240109-133327-root.json
13:20 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
13:18 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/wikifeeds: apply
13:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1180 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54574 and previous config saved to /var/cache/conftool/dbconfig/20240109-131822-root.json
13:16 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/wikifeeds: apply
13:14 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host ganeti2033.codfw.wmnet with OS bookworm
13:13 stevemunene@cumin1002: END (FAIL) - Cookbook sre.hadoop.roll-restart-masters (exit_code=99) restart masters for Hadoop analytics cluster: Restart of jvm daemons.
13:10 btullis@cumin1002: END (PASS) - Cookbook sre.presto.roll-restart-workers (exit_code=0) for Presto analytics cluster: Roll restart of all Presto's jvm daemons.
13:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1180 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54573 and previous config saved to /var/cache/conftool/dbconfig/20240109-130317-root.json
13:00 hnowlan@deploy2002: helmfile [eqiad] [main] DONE helmfile.d/services/mw-jobrunner : sync
13:00 hnowlan@deploy2002: helmfile [eqiad] [main] START helmfile.d/services/mw-jobrunner : sync
12:58 stevemunene@cumin1002: START - Cookbook sre.hadoop.roll-restart-masters restart masters for Hadoop analytics cluster: Restart of jvm daemons.
12:57 hnowlan@deploy2002: helmfile [codfw] [main] DONE helmfile.d/services/mw-jobrunner : sync
12:57 hnowlan@deploy2002: helmfile [codfw] [main] START helmfile.d/services/mw-jobrunner : sync
12:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1180 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54572 and previous config saved to /var/cache/conftool/dbconfig/20240109-124812-root.json
12:43 moritzm: imported mwbzutils 0.1.4~wmf-1+deb11u1 for bullseye-wikimedia T325228
12:43 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on mw[1380-1382].eqiad.wmnet with reason: failed reimage waiting on fix
12:42 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on mw[1380-1382].eqiad.wmnet with reason: failed reimage waiting on fix
12:39 btullis@cumin1002: START - Cookbook sre.presto.roll-restart-workers for Presto analytics cluster: Roll restart of all Presto's jvm daemons.
12:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1180 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54571 and previous config saved to /var/cache/conftool/dbconfig/20240109-123307-root.json
12:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1180 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54570 and previous config saved to /var/cache/conftool/dbconfig/20240109-121802-root.json
12:17 stevemunene@cumin1002: END (PASS) - Cookbook sre.hadoop.roll-restart-masters (exit_code=0) restart masters for Hadoop test cluster: Restart of jvm daemons.
12:10 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_esams and A:cp
12:07 taavi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
12:07 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove wiki replica LVS VIPs - taavi@cumin1002"
12:06 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1180.eqiad.wmnet with OS bookworm
12:06 taavi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove wiki replica LVS VIPs - taavi@cumin1002"
12:04 taavi@cumin1002: START - Cookbook sre.dns.netbox
12:02 marostegui@cumin1001: dbctl commit (dc=all): 'db1180 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54569 and previous config saved to /var/cache/conftool/dbconfig/20240109-120257-root.json
12:01 btullis@cumin1002: END (PASS) - Cookbook sre.kafka.roll-restart-mirror-maker (exit_code=0) restart MirrorMaker for Kafka A:kafka-mirror-maker-jumbo-eqiad cluster: Roll restart of jvm daemons.
11:50 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
11:50 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update dns entry for kubestage2002.codfw.wmnet - cmooney@cumin1002"
11:50 stevemunene@cumin1002: START - Cookbook sre.hadoop.roll-restart-masters restart masters for Hadoop test cluster: Restart of jvm daemons.
11:50 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_esams and A:cp
11:49 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update dns entry for kubestage2002.codfw.wmnet - cmooney@cumin1002"
11:46 cmooney@cumin1002: START - Cookbook sre.dns.netbox
11:46 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1180.eqiad.wmnet with reason: host reimage
11:43 btullis@cumin1002: START - Cookbook sre.kafka.roll-restart-mirror-maker restart MirrorMaker for Kafka A:kafka-mirror-maker-jumbo-eqiad cluster: Roll restart of jvm daemons.
11:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1180.eqiad.wmnet with reason: host reimage
11:38 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_drmrs and A:cp
11:37 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on lsw1-b8-codfw,lsw1-b8-codfw IPv6 with reason: Adding vlan to switch, precaution in case it triggers EVPN L3 bug.
11:37 btullis@cumin1002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-jumbo-eqiad
11:37 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on lsw1-b8-codfw,lsw1-b8-codfw IPv6 with reason: Adding vlan to switch, precaution in case it triggers EVPN L3 bug.
11:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on snapshot1014.eqiad.wmnet with reason: host reimage
11:32 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on snapshot1014.eqiad.wmnet with reason: host reimage
11:30 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1180.eqiad.wmnet with OS bookworm
11:30 cgoubert@cumin2002: conftool action : set/pooled=yes; selector: name=mw2394.codfw.wmnet,cluster=jobrunner
11:29 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1180 T354506', diff saved to https://phabricator.wikimedia.org/P54568 and previous config saved to /var/cache/conftool/dbconfig/20240109-112922-root.json
11:22 cgoubert@cumin2002: conftool action : set/pooled=no; selector: name=mw2394.codfw.wmnet
11:19 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host snapshot1014.eqiad.wmnet with OS bullseye
11:19 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
11:19 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
11:18 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
11:18 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
11:17 hnowlan@deploy2002: helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply
11:17 hnowlan@deploy2002: helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply
11:15 taavi@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1013.eqiad.wmnet,service=s3
11:15 taavi@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1013.eqiad.wmnet,service=s3
11:14 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_drmrs and A:cp
11:05 moritzm: installing exim security updates
10:54 godog: restart prometheus@k8s on prometheus1005 to see if labeldrop id will yield expected results - T354604
10:45 ayounsi@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host ganeti2033.codfw.wmnet with OS bookworm
10:38 btullis@cumin1002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-jumbo-eqiad
10:22 sfaci@deploy2002: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
10:21 sfaci@deploy2002: helmfile [staging] START helmfile.d/services/edit-analytics: apply
10:19 btullis@cumin1002: END (PASS) - Cookbook sre.opensearch.roll-restart-reboot (exit_code=0) rolling restart_daemons on A:datahubsearch
10:11 btullis@cumin1002: START - Cookbook sre.opensearch.roll-restart-reboot rolling restart_daemons on A:datahubsearch
10:00 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_drmrs and A:cp
09:59 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host ganeti2033.codfw.wmnet with OS bookworm
09:54 oblivian@deploy2002: Finished scap: Backport for Always process media files via shellbox on k8s (T352515) (duration: 11m 03s)
09:52 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
09:52 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti2033/2034 move - ayounsi@cumin1002"
09:48 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti2033/2034 move - ayounsi@cumin1002"
09:47 oblivian@deploy2002: oblivian: Continuing with sync
09:46 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
09:44 oblivian@deploy2002: oblivian: Backport for Always process media files via shellbox on k8s (T352515) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
09:43 oblivian@deploy2002: Started scap: Backport for Always process media files via shellbox on k8s (T352515)
09:39 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_drmrs and A:cp
09:34 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_codfw and A:cp
09:27 oblivian@deploy2002: Finished scap: Backport for Use shellbox for djvu handling on kubernetes (T352515) (duration: 23m 56s)
09:20 oblivian@deploy2002: oblivian: Continuing with sync
09:15 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_codfw and A:cp
09:14 moritzm: prune obsolete nginx packages from ncredir hosts after migration to new library scheme T329529
09:11 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_codfw and A:cp
09:06 arnaudb: upload wmfdb 0.1.4 from https://gitlab.wikimedia.org/repos/sre/wmfdb/-/tree/dgit/bookworm-wikimedia to fix default ca bundle
09:05 oblivian@deploy2002: oblivian: Backport for Use shellbox for djvu handling on kubernetes (T352515) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
09:03 oblivian@deploy2002: Started scap: Backport for Use shellbox for djvu handling on kubernetes (T352515)
08:59 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 45287
08:54 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 45287
08:54 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_codfw and A:cp
08:49 oblivian@deploy2002: Finished scap: Backport for Remove throttle exception (T352569) (duration: 09m 01s)
08:48 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 9902
08:47 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 9902
08:42 oblivian@deploy2002: oblivian: Continuing with sync
08:42 oblivian@deploy2002: oblivian: Backport for Remove throttle exception (T352569) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
08:40 oblivian@deploy2002: Started scap: Backport for Remove throttle exception (T352569)
08:22 kartik@deploy2002: Finished scap: Backport for testwiki: Enable Section translation on WPs with potential to be supported with MinT using MADLAD-400 (T353510) (duration: 15m 54s)
08:21 marostegui@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db2143.codfw.wmnet with OS bookworm
08:20 godog: set aside WAL for prometheus@k8s in codfw and restart - T354399
08:19 marostegui@cumin1001: dbctl commit (dc=all): 'db2151 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54567 and previous config saved to /var/cache/conftool/dbconfig/20240109-081946-root.json
08:11 kartik@deploy2002: kartik: Continuing with sync
08:10 kartik@deploy2002: kartik: Backport for testwiki: Enable Section translation on WPs with potential to be supported with MinT using MADLAD-400 (T353510) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
08:06 kartik@deploy2002: Started scap: Backport for testwiki: Enable Section translation on WPs with potential to be supported with MinT using MADLAD-400 (T353510)
08:05 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 100%: After a crash', diff saved to https://phabricator.wikimedia.org/P54566 and previous config saved to /var/cache/conftool/dbconfig/20240109-080558-root.json
08:04 marostegui@cumin1001: dbctl commit (dc=all): 'db2151 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54565 and previous config saved to /var/cache/conftool/dbconfig/20240109-080441-root.json
07:50 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 75%: After a crash', diff saved to https://phabricator.wikimedia.org/P54564 and previous config saved to /var/cache/conftool/dbconfig/20240109-075053-root.json
07:49 marostegui@cumin1001: dbctl commit (dc=all): 'db2151 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54563 and previous config saved to /var/cache/conftool/dbconfig/20240109-074936-root.json
07:35 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 50%: After a crash', diff saved to https://phabricator.wikimedia.org/P54562 and previous config saved to /var/cache/conftool/dbconfig/20240109-073548-root.json
07:34 marostegui@cumin1001: dbctl commit (dc=all): 'db2151 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54561 and previous config saved to /var/cache/conftool/dbconfig/20240109-073431-root.json
07:20 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 25%: After a crash', diff saved to https://phabricator.wikimedia.org/P54560 and previous config saved to /var/cache/conftool/dbconfig/20240109-072043-root.json
07:19 marostegui@cumin1001: dbctl commit (dc=all): 'db2151 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54559 and previous config saved to /var/cache/conftool/dbconfig/20240109-071926-root.json
07:05 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 10%: After a crash', diff saved to https://phabricator.wikimedia.org/P54558 and previous config saved to /var/cache/conftool/dbconfig/20240109-070538-root.json
07:04 marostegui@cumin1001: dbctl commit (dc=all): 'db2151 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54557 and previous config saved to /var/cache/conftool/dbconfig/20240109-070421-root.json
07:01 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2143.codfw.wmnet with OS bookworm
06:54 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2151.codfw.wmnet with OS bookworm
06:50 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 5%: After a crash', diff saved to https://phabricator.wikimedia.org/P54556 and previous config saved to /var/cache/conftool/dbconfig/20240109-065033-root.json
06:49 marostegui@cumin1001: dbctl commit (dc=all): 'db2151 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54555 and previous config saved to /var/cache/conftool/dbconfig/20240109-064916-root.json
06:35 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 1%: After a crash', diff saved to https://phabricator.wikimedia.org/P54554 and previous config saved to /var/cache/conftool/dbconfig/20240109-063528-root.json
06:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2151.codfw.wmnet with reason: host reimage
06:28 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2151.codfw.wmnet with reason: host reimage
06:28 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1224', diff saved to https://phabricator.wikimedia.org/P54553 and previous config saved to /var/cache/conftool/dbconfig/20240109-062806-root.json
06:11 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2151.codfw.wmnet with OS bookworm
06:10 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2151 T354506', diff saved to https://phabricator.wikimedia.org/P54552 and previous config saved to /var/cache/conftool/dbconfig/20240109-061015-root.json
03:11 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
03:11 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
03:11 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
03:10 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
03:10 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
03:10 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
01:22 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mw2394.mgmt.codfw.wmnet with reboot policy FORCED
01:17 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mw2394.mgmt.codfw.wmnet with reboot policy FORCED

2024-01-08

23:16 eileen: civicrm upgraded from 16b5417b to c7304245
22:58 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mw2394.mgmt.codfw.wmnet with reboot policy FORCED
22:57 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mw2394.mgmt.codfw.wmnet with reboot policy FORCED
22:56 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest2003.mgmt.codfw.wmnet with reboot policy GRACEFUL
22:30 ryankemper@puppetmaster1001: conftool action : set/weight=10:pooled=yes; selector: name=elastic2087\.codfw\.wmnet
22:04 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host sretest2003.mgmt.codfw.wmnet with reboot policy GRACEFUL
21:50 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
21:49 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
21:37 cjming: end of UTC late backport window
21:32 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mw2394.mgmt.codfw.wmnet with reboot policy FORCED
21:29 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mw2394.mgmt.codfw.wmnet with reboot policy FORCED
21:27 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mw2394.mgmt.codfw.wmnet with reboot policy FORCED
21:24 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mw2394.mgmt.codfw.wmnet with reboot policy FORCED
21:15 cjming@deploy2002: Finished scap: Backport for Remove android.metrics_platform.* stream definitions (T354199) (duration: 08m 17s)
21:08 cjming@deploy2002: cjming: Continuing with sync
21:08 cjming@deploy2002: cjming: Backport for Remove android.metrics_platform.* stream definitions (T354199) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
21:07 cjming@deploy2002: Started scap: Backport for Remove android.metrics_platform.* stream definitions (T354199)
19:30 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mw2394.mgmt.codfw.wmnet with reboot policy FORCED
19:28 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mw2394.mgmt.codfw.wmnet with reboot policy FORCED
19:27 taavi: make puppet re-generate empty envoy config file on testreduce1002 T345220
19:15 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
19:13 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
19:11 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
19:09 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
19:08 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
19:04 sukhe: running authdns-update for CR 988684: T345220
19:04 sukhe: running authdns-update for CR 988684: T336043
18:59 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
18:35 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
18:34 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
18:27 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
18:25 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
18:21 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
18:19 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
18:12 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
18:10 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
17:56 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
17:53 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host mw2394.mgmt.codfw.wmnet with reboot policy GRACEFUL
17:43 jdrewniak@deploy2002: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 06m 17s)
17:36 jdrewniak@deploy2002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 06m 21s)
17:34 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1377.eqiad.wmnet with OS bullseye
17:18 godog: wipe prometheus@k8s eqiad WAL and restart - T354399
17:17 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1377.eqiad.wmnet with reason: host reimage
17:15 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
17:15 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
17:14 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1377.eqiad.wmnet with reason: host reimage
17:14 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_eqsin and A:cp
17:12 ladsgroup@deploy2002: Finished scap: Backport for Undeploy Listings extension part III (T253216) (duration: 08m 01s)
17:08 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
17:07 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
17:06 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
17:06 ladsgroup@deploy2002: ladsgroup: Continuing with sync
17:06 ladsgroup@deploy2002: ladsgroup: Backport for Undeploy Listings extension part III (T253216) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
17:05 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
17:04 ladsgroup@deploy2002: Started scap: Backport for Undeploy Listings extension part III (T253216)
17:04 ladsgroup@deploy2002: Finished scap: Backport for Undeploy Listings extension part III (T253216) (duration: 12m 24s)
17:00 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1377.eqiad.wmnet with OS bullseye
16:57 ladsgroup@deploy2002: ladsgroup: Continuing with sync
16:54 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw1377.eqiad.wmnet with OS bullseye
16:53 ladsgroup@deploy2002: ladsgroup: Backport for Undeploy Listings extension part III (T253216) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
16:52 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti2034.codfw.wmnet
16:52 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
16:52 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti2034.codfw.wmnet decommissioned, removing all IPs except the asset tag one - pt1979@cumin2002"
16:52 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1377.eqiad.wmnet with reason: host reimage
16:51 ladsgroup@deploy2002: Started scap: Backport for Undeploy Listings extension part III (T253216)
16:49 ladsgroup@deploy2002: Finished scap: Backport for Undeploy Listings extension part III (T253216) (duration: 08m 47s)
16:49 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1377.eqiad.wmnet with reason: host reimage
16:48 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti2034.codfw.wmnet decommissioned, removing all IPs except the asset tag one - pt1979@cumin2002"
16:46 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_eqsin and A:cp
16:44 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_eqsin and not P{cp[5030,5032].eqsin.wmnet} and A:cp
16:43 ladsgroup@deploy2002: ladsgroup: Continuing with sync
16:42 pt1979@cumin2002: START - Cookbook sre.dns.netbox
16:42 ladsgroup@deploy2002: ladsgroup: Backport for Undeploy Listings extension part III (T253216) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
16:41 ladsgroup@deploy2002: Started scap: Backport for Undeploy Listings extension part III (T253216)
16:37 pt1979@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti2034.codfw.wmnet
16:36 btullis@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) dbstore1008.eqiad.wmnet on all recursors
16:36 btullis@cumin1002: START - Cookbook sre.dns.wipe-cache dbstore1008.eqiad.wmnet on all recursors
16:35 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1377.eqiad.wmnet with OS bullseye
16:35 btullis@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
16:35 btullis@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove unwanted AAAA records from new dbstore hosts - btullis@cumin1002"
16:34 btullis@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove unwanted AAAA records from new dbstore hosts - btullis@cumin1002"
16:33 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti2033.codfw.wmnet
16:33 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
16:33 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti2033.codfw.wmnet decommissioned, removing all IPs except the asset tag one - pt1979@cumin2002"
16:32 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti2033.codfw.wmnet decommissioned, removing all IPs except the asset tag one - pt1979@cumin2002"
16:30 btullis@cumin1002: START - Cookbook sre.dns.netbox
16:25 pt1979@cumin2002: START - Cookbook sre.dns.netbox
16:25 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_eqsin and not P{cp[5030,5032].eqsin.wmnet} and A:cp
16:25 ladsgroup@deploy2002: Finished scap: Backport for Undeploy Listings extension part III (T253216) (duration: 24m 06s)
16:24 taavi: lvs1018: sudo ipvsadm --delete-service --tcp-service 208.80.154.243:3311 (and all the way to :3318) - T346947
16:23 taavi: lvs1018: sudo ipvsadm --delete-service --tcp-service 208.80.154.242:3311 (and all the way to :3318) - T346947
16:21 taavi: lvs1020: sudo ipvsadm --delete-service --tcp-service 208.80.154.243:3311 (and all the way to :3318) - T346947
16:20 taavi: lvs1020: sudo ipvsadm --delete-service --tcp-service 208.80.154.242:3311 (and all the way to :3318) - T346947
16:18 pt1979@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti2033.codfw.wmnet
16:15 taavi: restart pybal on lvs1018 - T346947
16:14 ladsgroup@deploy2002: ladsgroup: Continuing with sync
16:14 ladsgroup@deploy2002: ladsgroup: Backport for Undeploy Listings extension part III (T253216) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
16:09 taavi: restart pybal on lvs1020 - T346947
16:01 ladsgroup@deploy2002: Started scap: Backport for Undeploy Listings extension part III (T253216)
15:59 sfaci@deploy2002: helmfile [eqiad] DONE helmfile.d/services/edit-analytics: apply
15:59 sfaci@deploy2002: helmfile [eqiad] START helmfile.d/services/edit-analytics: apply
15:58 ladsgroup@deploy2002: Finished scap: Backport for Undeploy listing extension part II (T253216) (duration: 08m 40s)
15:57 sfaci@deploy2002: helmfile [codfw] DONE helmfile.d/services/edit-analytics: apply
15:57 sfaci@deploy2002: helmfile [codfw] START helmfile.d/services/edit-analytics: apply
15:52 ladsgroup@deploy2002: ladsgroup: Continuing with sync
15:51 ladsgroup@deploy2002: ladsgroup: Backport for Undeploy listing extension part II (T253216) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
15:49 ladsgroup@deploy2002: Started scap: Backport for Undeploy listing extension part II (T253216)
15:48 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on mw1377.eqiad.wmnet with reason: reboot debugging
15:48 cgoubert@cumin2002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on mw1377.eqiad.wmnet with reason: reboot debugging
15:47 ladsgroup@deploy2002: Finished scap: Backport for Undeploy Listings extension, part I (T253216) (duration: 08m 22s)
15:46 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
15:46 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
15:45 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
15:41 ladsgroup@deploy2002: ladsgroup: Continuing with sync
15:40 ladsgroup@deploy2002: ladsgroup: Backport for Undeploy Listings extension, part I (T253216) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
15:40 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
15:38 ladsgroup@deploy2002: Started scap: Backport for Undeploy Listings extension, part I (T253216)
15:35 claime: Draining and cordoning kubestage2002.codfw.wmnet - T352883
15:32 krinkle@deploy2002: Finished scap: Backport for Fix parsing logic when comments or hidden characters are present (T354385) (duration: 07m 52s)
15:26 krinkle@deploy2002: krinkle: Continuing with sync
15:26 krinkle@deploy2002: krinkle: Backport for Fix parsing logic when comments or hidden characters are present (T354385) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
15:24 krinkle@deploy2002: Started scap: Backport for Fix parsing logic when comments or hidden characters are present (T354385)
14:46 urbanecm@deploy2002: Finished scap: Backport for Add agent.app_install_id to android.product_metrics.* streams (T353680), Remove partial migration of EditAttemptStep instrument (T351335), Add new stream names to the config variable (T353297), agent.app_ -> agent_app_ in android.product_metrics.* streams (T353680) (duration: 10m 22s)
14:40 urbanecm@deploy2002: urbanecm and phuedx and ksarabia and sfaci: Continuing with sync
14:37 urbanecm@deploy2002: urbanecm and phuedx and ksarabia and sfaci: Backport for Add agent.app_install_id to android.product_metrics.* streams (T353680), Remove partial migration of EditAttemptStep instrument (T351335), Add new stream names to the config variable (T353297), agent.app_ -> agent_app_ in android.product_metrics.* streams (T353680) synce
14:35 urbanecm@deploy2002: Started scap: Backport for Add agent.app_install_id to android.product_metrics.* streams (T353680), Remove partial migration of EditAttemptStep instrument (T351335), Add new stream names to the config variable (T353297), agent.app_ -> agent_app_ in android.product_metrics.* streams (T353680)
14:34 urbanecm@deploy2002: Sync cancelled.
14:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10 days, 0:00:00 on debmonitor2003.codfw.wmnet with reason: WIP
14:27 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 10 days, 0:00:00 on debmonitor2003.codfw.wmnet with reason: WIP
14:17 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54548 and previous config saved to /var/cache/conftool/dbconfig/20240108-141717-root.json
14:14 urbanecm@deploy2002: urbanecm and phuedx and ksarabia and sfaci: Backport for Add agent.app_install_id to android.product_metrics.* streams (T353680), Remove partial migration of EditAttemptStep instrument (T351335), Add new stream names to the config variable (T353297) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:13 urbanecm@deploy2002: Started scap: Backport for Add agent.app_install_id to android.product_metrics.* streams (T353680), Remove partial migration of EditAttemptStep instrument (T351335), Add new stream names to the config variable (T353297)
14:12 urbanecm@deploy2002: Finished scap: Backport for enable page_rerender for 3rd batch of wikis (T351503) (duration: 09m 35s)
14:06 urbanecm@deploy2002: pfischer and urbanecm: Continuing with sync
14:04 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
14:04 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
14:04 urbanecm@deploy2002: pfischer and urbanecm: Backport for enable page_rerender for 3rd batch of wikis (T351503) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:02 urbanecm@deploy2002: Started scap: Backport for enable page_rerender for 3rd batch of wikis (T351503)
14:02 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54547 and previous config saved to /var/cache/conftool/dbconfig/20240108-140212-root.json
14:01 moritzm: installing curl security updates
13:47 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54546 and previous config saved to /var/cache/conftool/dbconfig/20240108-134707-root.json
13:33 jelto@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
13:33 jelto@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
13:32 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54545 and previous config saved to /var/cache/conftool/dbconfig/20240108-133202-root.json
13:32 jelto@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
13:31 jelto@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
13:30 marostegui@cumin1001: dbctl commit (dc=all): 'db2117 (re)pooling @ 100%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54544 and previous config saved to /var/cache/conftool/dbconfig/20240108-133016-root.json
13:16 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54543 and previous config saved to /var/cache/conftool/dbconfig/20240108-131657-root.json
13:15 marostegui@cumin1001: dbctl commit (dc=all): 'db2117 (re)pooling @ 75%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54542 and previous config saved to /var/cache/conftool/dbconfig/20240108-131511-root.json
13:01 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54541 and previous config saved to /var/cache/conftool/dbconfig/20240108-130152-root.json
13:00 marostegui@cumin1001: dbctl commit (dc=all): 'db2117 (re)pooling @ 50%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54540 and previous config saved to /var/cache/conftool/dbconfig/20240108-130006-root.json
12:46 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54539 and previous config saved to /var/cache/conftool/dbconfig/20240108-124647-root.json
12:46 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1224.eqiad.wmnet with OS bookworm
12:45 marostegui@cumin1001: dbctl commit (dc=all): 'db2117 (re)pooling @ 25%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54538 and previous config saved to /var/cache/conftool/dbconfig/20240108-124501-root.json
12:29 marostegui@cumin1001: dbctl commit (dc=all): 'db2117 (re)pooling @ 10%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54537 and previous config saved to /var/cache/conftool/dbconfig/20240108-122956-root.json
12:26 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1224.eqiad.wmnet with reason: host reimage
12:23 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1224.eqiad.wmnet with reason: host reimage
12:14 marostegui@cumin1001: dbctl commit (dc=all): 'db2117 (re)pooling @ 5%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54536 and previous config saved to /var/cache/conftool/dbconfig/20240108-121451-root.json
12:10 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1224.eqiad.wmnet with OS bookworm
12:08 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1224 T354506', diff saved to https://phabricator.wikimedia.org/P54535 and previous config saved to /var/cache/conftool/dbconfig/20240108-120759-root.json
12:03 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 45287
12:02 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 45287
12:02 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 35847
12:02 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 35847
12:01 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 9902
12:00 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 9902
12:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2117.codfw.wmnet with OS bookworm
11:59 marostegui@cumin1001: dbctl commit (dc=all): 'db2117 (re)pooling @ 1%: Upgrade to 10.6.16 and bookworm', diff saved to https://phabricator.wikimedia.org/P54534 and previous config saved to /var/cache/conftool/dbconfig/20240108-115946-root.json
11:57 ladsgroup@deploy2002: Finished scap: Backport for Disable Listings extension everywhere except rowikivoyage (T253216) (duration: 08m 43s)
11:50 ladsgroup@deploy2002: ladsgroup: Continuing with sync
11:50 ladsgroup@deploy2002: ladsgroup: Backport for Disable Listings extension everywhere except rowikivoyage (T253216) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
11:48 ladsgroup@deploy2002: Started scap: Backport for Disable Listings extension everywhere except rowikivoyage (T253216)
11:45 taavi@deploy2002: Finished scap: Backport for OATHAuthServices: Fix service name (T354505), Fix disabling two-factor authentication (T354505) (duration: 09m 21s)
11:39 taavi@deploy2002: taavi: Continuing with sync
11:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2117.codfw.wmnet with reason: host reimage
11:38 taavi@deploy2002: taavi: Backport for OATHAuthServices: Fix service name (T354505), Fix disabling two-factor authentication (T354505) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
11:36 taavi@deploy2002: Started scap: Backport for OATHAuthServices: Fix service name (T354505), Fix disabling two-factor authentication (T354505)
11:35 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2117.codfw.wmnet with reason: host reimage
11:29 ladsgroup@deploy2002: Finished scap: Backport for Stop writing to the old columns of pagelinks in testwiki (T352010) (duration: 10m 02s)
11:23 ladsgroup@deploy2002: ladsgroup: Continuing with sync
11:20 ladsgroup@deploy2002: ladsgroup: Backport for Stop writing to the old columns of pagelinks in testwiki (T352010) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
11:19 ladsgroup@deploy2002: Started scap: Backport for Stop writing to the old columns of pagelinks in testwiki (T352010)
11:17 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2117.codfw.wmnet with OS bookworm
11:14 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2117 T354506', diff saved to https://phabricator.wikimedia.org/P54533 and previous config saved to /var/cache/conftool/dbconfig/20240108-111452-root.json
10:36 XioNoX: repool eqsin - T332395
10:33 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
10:32 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
10:21 ladsgroup@deploy2002: Finished scap: Backport for styles: Replace obsolete WikimediaUI Base var with Codex alias (duration: 07m 32s)
10:20 klausman@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
10:20 klausman@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
10:15 ladsgroup@deploy2002: volker-e and ladsgroup: Continuing with sync
10:15 ladsgroup@deploy2002: volker-e and ladsgroup: Backport for styles: Replace obsolete WikimediaUI Base var with Codex alias synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
10:14 ladsgroup@deploy2002: Started scap: Backport for styles: Replace obsolete WikimediaUI Base var with Codex alias
10:11 ladsgroup@deploy2002: Finished scap: Backport for Set commonswiki pagelinks migration stage to READ NEW (T351237) (duration: 08m 52s)
10:05 ladsgroup@deploy2002: ladsgroup: Continuing with sync
10:04 ladsgroup@deploy2002: ladsgroup: Backport for Set commonswiki pagelinks migration stage to READ NEW (T351237) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
10:02 ladsgroup@deploy2002: Started scap: Backport for Set commonswiki pagelinks migration stage to READ NEW (T351237)
09:54 XioNoX: asw1-eqsin> request system reboot - T332395
09:32 Emperor: reboot ms-be2074-80 before adding them to the rings T353149
09:32 Emperor: reboot ms-be1072-82 before adding them to the rings T353149
09:24 XioNoX: start install process on asw1-eqsin - T332395
09:05 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 35 hosts with reason: eqsin switch upgrade
09:04 ayounsi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on 35 hosts with reason: eqsin switch upgrade
09:03 XioNoX: depool eqsin for switch upgrade - T332395
08:27 xSavitar: UTC morning backport window done.
08:26 derick@deploy2002: Finished scap: Backport for wmf-config: Remove unused wgStatsCacheType setting (T336004) (duration: 09m 11s)
08:20 derick@deploy2002: derick and d3r1ck01: Continuing with sync
08:18 derick@deploy2002: derick and d3r1ck01: Backport for wmf-config: Remove unused wgStatsCacheType setting (T336004) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
08:17 derick@deploy2002: Started scap: Backport for wmf-config: Remove unused wgStatsCacheType setting (T336004)

2024-01-06

22:27 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
22:27 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
22:18 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.

2024-01-05

23:49 thcipriani@deploy2002: Finished deploy [gerrit/gerrit@de3a994]: Removing survey banner gerrit:987995 (gerrit.wikimedia.org only this deploy) (duration: 00m 08s)
23:49 thcipriani@deploy2002: Started deploy [gerrit/gerrit@de3a994]: Removing survey banner gerrit:987995 (gerrit.wikimedia.org only this deploy)
23:31 thcipriani@deploy2002: Finished deploy [gerrit/gerrit@de3a994]: Removing survey banner gerrit:987995 (gerrit-replicas only this deploy) (duration: 00m 06s)
23:31 thcipriani@deploy2002: Started deploy [gerrit/gerrit@de3a994]: Removing survey banner gerrit:987995 (gerrit-replicas only this deploy)
23:25 thcipriani: deploying gerrit to remove survey banner https://gerrit.wikimedia.org/r/987995 (no downtime needed)
19:29 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for restbase2034.codfw.wmnet
19:29 eevans@cumin1002: START - Cookbook sre.hosts.remove-downtime for restbase2034.codfw.wmnet
19:23 eevans@cumin1002: conftool action : set/weight=10; selector: cluster=restbase,dc=codfw,name=restbase2034.codfw.wmnet
19:07 mutante: vrts1001 - sudo systemctl start clamav-daemon
17:14 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
17:13 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
16:43 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
16:42 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
16:40 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
16:30 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
16:29 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
16:19 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
15:40 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
15:40 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
15:40 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
15:40 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
15:31 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
15:30 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
14:50 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
14:50 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
14:45 milimetric@deploy2002: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
14:45 milimetric@deploy2002: helmfile [staging] START helmfile.d/services/edit-analytics: apply
14:43 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
14:42 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
14:41 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
14:41 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
14:38 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
14:37 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
14:14 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
14:14 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
13:42 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
13:41 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
13:23 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
13:23 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
11:56 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mw1379.eqiad.wmnet
11:49 kamila@cumin1002: START - Cookbook sre.hosts.reboot-single for host mw1379.eqiad.wmnet
09:26 moritzm: installing 5.10.205 kernels on Bullseye hosts
09:15 _joe_: upgrading conftool across the fleet
08:01 moritzm: installing 6.1.69 kernels on Bookworm hosts
01:27 zabe: zabe@mwmaint2002:~$ mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=arzwiki --logwiki=metawiki 'WanderingPlaywrite' 'WanderingPlaywright' # T354397
00:59 cwhite: restarted prometheus@k8s on prometheus1006 and backed up the wal for OOM loop investigation
00:52 cwhite: restarted prometheus@k8s on prometheus1005 and backed up the wal for OOM loop investigation

2024-01-04

23:10 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
23:10 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
22:34 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
22:33 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
22:33 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
22:33 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
22:31 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
22:31 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
22:29 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
22:29 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
22:29 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
22:29 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
22:25 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
22:25 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
22:24 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
22:24 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
22:22 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
22:22 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
22:22 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
22:21 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
22:21 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
22:21 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
22:00 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
22:00 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
21:38 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
21:38 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
21:27 brennen: end of utc late backport window
21:26 brennen@deploy2002: Finished scap: Backport for Ensure all non-okay statuses from ::getImageContents have a message (T354374) (duration: 08m 01s)
21:20 brennen@deploy2002: brennen and dreamyjazz: Continuing with sync
21:19 brennen@deploy2002: brennen and dreamyjazz: Backport for Ensure all non-okay statuses from ::getImageContents have a message (T354374) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
21:18 brennen@deploy2002: Started scap: Backport for Ensure all non-okay statuses from ::getImageContents have a message (T354374)
21:17 brennen@deploy2002: Finished scap: Backport for Check for invalid JSON on a good response from PhotoDNA (T354370) (duration: 07m 57s)
21:11 brennen@deploy2002: brennen and dreamyjazz: Continuing with sync
21:10 brennen@deploy2002: brennen and dreamyjazz: Backport for Check for invalid JSON on a good response from PhotoDNA (T354370) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
21:09 brennen@deploy2002: Started scap: Backport for Check for invalid JSON on a good response from PhotoDNA (T354370)
20:41 ryankemper: [apifeatureusage] T350703 Restarted `logstash` on `apifeatureusage[1,2]001`
20:39 dduvall@deploy2002: rebuilt and synchronized wikiversions files: group2 wikis to 1.42.0-wmf.12 refs T350088
20:30 mutante: mwmaint2002 - /usr/local/sbin/sync-home-mwmaint after gerrit:987778
20:20 dduvall@deploy2002: Synchronized php: group1 wikis to 1.42.0-wmf.12 refs T350088 (duration: 06m 09s)
20:16 ejegg: standalone (payments listener) SmashPig upgraded from fc74ccca to 20d6434e
20:13 dduvall@deploy2002: rebuilt and synchronized wikiversions files: group1 wikis to 1.42.0-wmf.12 refs T350088
20:03 mutante: releases2003 - systemctl status rsync-srv-org-wikimedia-releases-releases2003.codfw.wmnet after gerrit:987436
20:01 mutante: releases2003 - systemctl start rsync-srv-patches-releases2003.codfw.wmnet after gerrit:987436
19:59 brett: restarting pybal on lvs5006 for testing purposes - T353760
19:59 mutante: releases1003 - systemctl start rsync-srv-patches-releases-primary after gerrit:987436
19:57 dcausse: repooling wdqs1019
19:52 ebernhardson@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
19:51 ebernhardson@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
19:49 dduvall@deploy2002: rebuilt and synchronized wikiversions files: group0 wikis to 1.42.0-wmf.12 refs T350088
19:47 mutante: deploy1002 - systemctl start rsync-patches_module after gerrit:987436
19:32 dduvall@deploy2002: Finished scap: Backport for Revise logic for creating compact links button on Vector 2022 (T353850) (duration: 07m 58s)
19:26 dduvall@deploy2002: jdlrobson and dduvall: Continuing with sync
19:26 dduvall@deploy2002: jdlrobson and dduvall: Backport for Revise logic for creating compact links button on Vector 2022 (T353850) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
19:25 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
19:25 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
19:24 dduvall@deploy2002: Started scap: Backport for Revise logic for creating compact links button on Vector 2022 (T353850)
19:22 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
19:22 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
19:04 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
19:04 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
18:46 sukhe: [second time] mx2001: exiqgrep -i -r w*@gmail.com | xargs exim -Mrm
18:03 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1377.eqiad.wmnet with OS bullseye
17:57 sukhe: mx2001: exiqgrep -i -r w*@gmail.com | xargs exim -Mrm
17:46 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1377.eqiad.wmnet with reason: host reimage
17:43 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1377.eqiad.wmnet with reason: host reimage
17:42 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
17:42 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
17:35 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
17:34 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
17:28 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1377.eqiad.wmnet with OS bullseye
17:10 oblivian@puppetmaster2001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=kubernetes,service=kubesvc,name=mw1377.*
16:43 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
16:42 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
16:42 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
16:41 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
16:41 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
16:41 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
16:36 volans@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts mw1378.eqiad.wmnet
16:25 volans@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts mw1378.eqiad.wmnet
16:00 volans@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts mw1378.eqiad.wmnet
15:59 volans@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts mw1378.eqiad.wmnet
15:58 moritzm: installing libdatetime-timezone-perl updates
15:51 moritzm: rolling restart of FPM/apache on mw canaries to pick up curl updates
15:48 XioNoX: repool esams - T346779
15:46 volans@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts mw1378.eqiad.wmnet
15:38 XioNoX: undrain esams-eqiad transport - T346779
15:37 XioNoX: re-enable peering/transit on cr1-esams - T346779
15:35 volans@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts mw1378.eqiad.wmnet
15:30 XioNoX: reboot fpc0 on cr1-esams - T346779
15:29 volans@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mw1378.mgmt.eqiad.wmnet with reboot policy GRACEFUL
15:26 XioNoX: disable peering/transit on cr1-esams for linecard reboot - T346779
15:19 volans: running sre.hosts.provision for mw1378 - T351074
15:19 volans@cumin2002: START - Cookbook sre.hosts.provision for host mw1378.mgmt.eqiad.wmnet with reboot policy GRACEFUL
15:16 XioNoX: drain esams-eqiad transport - T346779
15:14 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
15:13 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
15:13 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
15:13 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
15:12 moritzm: installing curl security updates
15:08 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
15:08 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
15:08 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
15:08 volans: rebooting mw1378 (downtimed and depooled) to debug reboot issues after reimage - T351074
15:08 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
15:07 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
15:07 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
15:07 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
15:07 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
15:05 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
15:05 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
15:04 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
15:04 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
15:01 XioNoX: depool esams for router work - T346779
15:00 tchanders@deploy2002: Finished scap: Backport for enable page_rerender for 2nd batch: dewiki, frwiktionary, and kuwiktionary (duration: 17m 55s)
14:59 volans: rebooting mw1378 (downtimed and depooled) to debug reboot issues after reimage - T351074
14:56 volans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on mw1378.eqiad.wmnet with reason: WIP hosts to be setup
14:56 volans@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on mw1378.eqiad.wmnet with reason: WIP hosts to be setup
14:54 tchanders@deploy2002: pfischer and tchanders: Continuing with sync
14:45 tchanders@deploy2002: pfischer and tchanders: Backport for enable page_rerender for 2nd batch: dewiki, frwiktionary, and kuwiktionary synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:42 tchanders@deploy2002: Started scap: Backport for enable page_rerender for 2nd batch: dewiki, frwiktionary, and kuwiktionary
14:40 tchanders@deploy2002: Finished scap: Backport for Attempt to send original file to PhotoDNA if no thumbnail (T353854) (duration: 09m 25s)
14:34 tchanders@deploy2002: tchanders and dreamyjazz: Continuing with sync
14:34 tchanders@deploy2002: tchanders and dreamyjazz: Backport for Attempt to send original file to PhotoDNA if no thumbnail (T353854) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:30 tchanders@deploy2002: Started scap: Backport for Attempt to send original file to PhotoDNA if no thumbnail (T353854)
14:25 tchanders@deploy2002: Finished scap: Backport for Attempt to send original file to PhotoDNA if no thumbnail (T353854) (duration: 09m 24s)
14:20 tchanders@deploy2002: dreamyjazz and tchanders: Continuing with sync
14:20 tchanders@deploy2002: dreamyjazz and tchanders: Backport for Attempt to send original file to PhotoDNA if no thumbnail (T353854) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:16 tchanders@deploy2002: Started scap: Backport for Attempt to send original file to PhotoDNA if no thumbnail (T353854)
14:12 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
14:12 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
14:09 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
14:09 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
14:09 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
14:08 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
14:08 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
14:08 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
14:06 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
14:03 XioNoX: repool drmrs - T354340
14:01 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
14:00 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
14:00 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
13:57 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 2686
13:56 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 2686
13:53 moritzm: installing libssh security updates
13:24 dcausse: restarting blazegraph on wdqs1019 (stuck with high thread count)
13:07 zabe@deploy2002: Finished scap: Backport for Revert "Get blocks from DatabaseBlockStore instead of doing our own query" (T353620), Revert "Support new block schema" (T354298) (duration: 10m 06s)
13:02 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host mw1377.eqiad.wmnet
13:02 XioNoX: depool drmrs for router work - T354340
13:01 zabe@deploy2002: zabe: Continuing with sync
13:00 zabe@deploy2002: zabe: Backport for Revert "Get blocks from DatabaseBlockStore instead of doing our own query" (T353620), Revert "Support new block schema" (T354298) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
12:56 zabe@deploy2002: Started scap: Backport for Revert "Get blocks from DatabaseBlockStore instead of doing our own query" (T353620), Revert "Support new block schema" (T354298)
12:53 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 63296
12:52 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 63296
12:10 kamila@cumin1002: START - Cookbook sre.hosts.reboot-single for host mw1377.eqiad.wmnet
12:04 moritzm: installing lua5.3 security updates
11:52 moritzm: installing libde265 security updates
11:35 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1379.eqiad.wmnet with OS bullseye
11:19 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1379.eqiad.wmnet with reason: host reimage
11:16 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1379.eqiad.wmnet with reason: host reimage
11:01 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1379.eqiad.wmnet with OS bullseye
10:51 akosiaris@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
10:33 akosiaris@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
10:32 akosiaris@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
10:17 akosiaris: bump memory limits for calico-node in wikikube codfw/eqiad by 25% (i.e from 400Mi to 500Mi) take #3
10:17 akosiaris@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
09:57 akosiaris@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
09:38 akosiaris: delete mw1377-mw1383 from eqiad wikikube nodes
09:38 akosiaris: bump memory limits for calico-node in wikikube codfw/eqiad by 25% (i.e from 400Mi to 500Mi) take #2
09:36 akosiaris@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
09:36 akosiaris@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
09:22 akosiaris: bump memory limits for calico-node in wikikube codfw/eqiad by 25% (i.e from 400Mi to 500Mi)
09:22 akosiaris@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
09:13 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
09:13 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
09:13 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
09:12 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
09:11 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
09:09 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
08:49 ladsgroup@deploy2002: Finished scap: Backport for Update virtual domain for url shortener (duration: 12m 35s)
08:43 ladsgroup@deploy2002: ladsgroup: Continuing with sync
08:38 ladsgroup@deploy2002: ladsgroup: Backport for Update virtual domain for url shortener synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
08:36 ladsgroup@deploy2002: Started scap: Backport for Update virtual domain for url shortener
08:34 ladsgroup@deploy2002: Finished scap: Backport for Add virtual domain config for reading lists extension (T353948) (duration: 09m 05s)
08:28 ladsgroup@deploy2002: ladsgroup: Continuing with sync
08:27 ladsgroup@deploy2002: ladsgroup: Backport for Add virtual domain config for reading lists extension (T353948) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
08:25 ladsgroup@deploy2002: Started scap: Backport for Add virtual domain config for reading lists extension (T353948)
07:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1151.eqiad.wmnet with OS bookworm
06:42 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1151.eqiad.wmnet with reason: host reimage
06:40 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1151.eqiad.wmnet with reason: host reimage
06:28 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1151.eqiad.wmnet with OS bookworm
03:49 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.

2024-01-03

23:50 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on mw1379.eqiad.wmnet with reason: failed reimage, will fix tomorrow
23:50 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on mw1379.eqiad.wmnet with reason: failed reimage, will fix tomorrow
23:50 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on mw1379.eqiad.wmnet with reason: failed reimage, will fix tomorrow
23:50 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on mw1379.eqiad.wmnet with reason: failed reimage, will fix tomorrow
23:33 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.restart (exit_code=0)
23:24 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
23:24 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
23:18 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1383.eqiad.wmnet with OS bullseye
23:15 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1380.eqiad.wmnet with OS bullseye
23:14 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1382.eqiad.wmnet with OS bullseye
23:12 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1378.eqiad.wmnet with OS bullseye
23:10 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1381.eqiad.wmnet with OS bullseye
23:07 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw1379.eqiad.wmnet with OS bullseye
23:02 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1383.eqiad.wmnet with reason: host reimage
23:01 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.restart (exit_code=0)
22:59 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw1380.eqiad.wmnet with reason: host reimage
22:59 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1379.eqiad.wmnet with reason: host reimage
22:57 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1382.eqiad.wmnet with reason: host reimage
22:54 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw1381.eqiad.wmnet with reason: host reimage
22:54 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1378.eqiad.wmnet with reason: host reimage
22:52 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1383.eqiad.wmnet with reason: host reimage
22:52 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1382.eqiad.wmnet with reason: host reimage
22:52 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1380.eqiad.wmnet with reason: host reimage
22:52 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1381.eqiad.wmnet with reason: host reimage
22:51 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1379.eqiad.wmnet with reason: host reimage
22:51 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1378.eqiad.wmnet with reason: host reimage
22:40 bking@cumin2002: START - Cookbook sre.wdqs.restart
22:38 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1383.eqiad.wmnet with OS bullseye
22:38 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1382.eqiad.wmnet with OS bullseye
22:37 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1381.eqiad.wmnet with OS bullseye
22:37 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1380.eqiad.wmnet with OS bullseye
22:37 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1379.eqiad.wmnet with OS bullseye
22:36 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1378.eqiad.wmnet with OS bullseye
22:36 bking@cumin2002: START - Cookbook sre.wdqs.restart
22:20 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2087.codfw.wmnet with OS bullseye
22:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2087.codfw.wmnet with reason: host reimage
21:59 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2087.codfw.wmnet with reason: host reimage
21:52 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1377.eqiad.wmnet with OS bullseye
21:48 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 6 hosts with reason: broken reimage
21:47 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on 6 hosts with reason: broken reimage
21:43 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2087.codfw.wmnet with OS bullseye
21:36 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1377.eqiad.wmnet with reason: host reimage
21:34 zabe@deploy2002: Finished scap: Backport for Update mediawiki/mediawiki-codesniffer to 42.0.0 (duration: 10m 34s)
21:33 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1377.eqiad.wmnet with reason: host reimage
21:28 zabe@deploy2002: zabe: Continuing with sync
21:27 zabe@deploy2002: zabe: Backport for Update mediawiki/mediawiki-codesniffer to 42.0.0 synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
21:24 zabe@deploy2002: Started scap: Backport for Update mediawiki/mediawiki-codesniffer to 42.0.0
21:19 TheresNoTime: UTC late backport window done
21:18 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1377.eqiad.wmnet with OS bullseye
21:14 samtar@deploy2002: Finished scap: Backport for Add "patroller" user group to testwiki (T354063) (duration: 12m 19s)
21:08 samtar@deploy2002: novemlinguae and samtar: Continuing with sync
21:06 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw1383.eqiad.wmnet with OS bullseye
21:06 samtar@deploy2002: novemlinguae and samtar: Backport for Add "patroller" user group to testwiki (T354063) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
21:04 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw1382.eqiad.wmnet with OS bullseye
21:02 samtar@deploy2002: Started scap: Backport for Add "patroller" user group to testwiki (T354063)
20:59 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw1381.eqiad.wmnet with OS bullseye
20:47 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw1380.eqiad.wmnet with OS bullseye
20:45 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw1379.eqiad.wmnet with OS bullseye
20:37 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw1378.eqiad.wmnet with OS bullseye
20:34 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw1377.eqiad.wmnet with OS bullseye
20:17 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2450.codfw.wmnet with OS bullseye
20:15 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2443.codfw.wmnet with OS bullseye
20:11 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2451.codfw.wmnet with OS bullseye
20:04 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2442.codfw.wmnet with OS bullseye
20:00 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1383.eqiad.wmnet with reason: host reimage
19:57 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2436.codfw.wmnet with OS bullseye
19:57 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1382.eqiad.wmnet with reason: host reimage
19:57 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw2450.codfw.wmnet with reason: host reimage
19:55 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2440.codfw.wmnet with OS bullseye
19:55 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2443.codfw.wmnet with reason: host reimage
19:53 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2437.codfw.wmnet with OS bullseye
19:52 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1381.eqiad.wmnet with reason: host reimage
19:51 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw2451.codfw.wmnet with reason: host reimage
19:51 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2451.codfw.wmnet with reason: host reimage
19:51 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2450.codfw.wmnet with reason: host reimage
19:50 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2443.codfw.wmnet with reason: host reimage
19:50 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1383.eqiad.wmnet with reason: host reimage
19:49 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1382.eqiad.wmnet with reason: host reimage
19:49 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1381.eqiad.wmnet with reason: host reimage
19:44 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2442.codfw.wmnet with reason: host reimage
19:42 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1380.eqiad.wmnet with reason: host reimage
19:39 mutante: root@doc2002: /usr/local/sbin/sync-doc-host-data-sync after gerrit:987406
19:39 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1379.eqiad.wmnet with reason: host reimage
19:38 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2442.codfw.wmnet with reason: host reimage
19:36 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1380.eqiad.wmnet with reason: host reimage
19:36 kamila@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw2440.codfw.wmnet with reason: host reimage
19:36 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2436.codfw.wmnet with reason: host reimage
19:35 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1383.eqiad.wmnet with OS bullseye
19:35 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2440.codfw.wmnet with reason: host reimage
19:35 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1379.eqiad.wmnet with reason: host reimage
19:35 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1382.eqiad.wmnet with OS bullseye
19:34 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1381.eqiad.wmnet with OS bullseye
19:33 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2451.codfw.wmnet with OS bullseye
19:33 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2437.codfw.wmnet with reason: host reimage
19:33 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2450.codfw.wmnet with OS bullseye
19:32 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2443.codfw.wmnet with OS bullseye
19:31 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1378.eqiad.wmnet with reason: host reimage
19:28 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2437.codfw.wmnet with reason: host reimage
19:28 kamila@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1377.eqiad.wmnet with reason: host reimage
19:26 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2436.codfw.wmnet with reason: host reimage
19:26 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1378.eqiad.wmnet with reason: host reimage
19:25 kamila@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1377.eqiad.wmnet with reason: host reimage
19:22 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1380.eqiad.wmnet with OS bullseye
19:21 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1379.eqiad.wmnet with OS bullseye
19:19 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2442.codfw.wmnet with OS bullseye
19:18 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2440.codfw.wmnet with OS bullseye
19:11 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1378.eqiad.wmnet with OS bullseye
19:11 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw1377.eqiad.wmnet with OS bullseye
19:10 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2437.codfw.wmnet with OS bullseye
19:08 kamila@cumin1002: START - Cookbook sre.hosts.reimage for host mw2436.codfw.wmnet with OS bullseye
18:27 brennen@deploy2002: Finished deploy [phabricator/deployment@369e797]: deploy to phab2002 for T334519 (duration: 00m 27s)
18:27 brennen@deploy2002: Started deploy [phabricator/deployment@369e797]: deploy to phab2002 for T334519
18:27 brennen: running an essentially no-op phab2002 deploy
18:11 dduvall@deploy2002: Finished scap: testwikis wikis to 1.42.0-wmf.12 refs T350088 (duration: 07m 23s)
18:03 dduvall@deploy2002: Started scap: testwikis wikis to 1.42.0-wmf.12 refs T350088
17:06 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo and not P{cp4044.ulsfo.wmnet} and A:cp
16:45 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo and not P{cp4044.ulsfo.wmnet} and A:cp
16:33 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo and not P{cp4050.ulsfo.wmnet,cp4051.ulsfo.wmnet} and A:cp
16:27 stevemunene@deploy2002: helmfile [eqiad] DONE helmfile.d/services/editor-analytics: apply
16:27 stevemunene@deploy2002: helmfile [eqiad] START helmfile.d/services/editor-analytics: apply
16:27 stevemunene@deploy2002: helmfile [codfw] DONE helmfile.d/services/editor-analytics: apply
16:26 stevemunene@deploy2002: helmfile [codfw] START helmfile.d/services/editor-analytics: apply
16:26 stevemunene@deploy2002: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
16:26 stevemunene@deploy2002: helmfile [staging] START helmfile.d/services/editor-analytics: apply
16:25 stevemunene@deploy2002: helmfile [eqiad] DONE helmfile.d/services/edit-analytics: apply
16:25 stevemunene@deploy2002: helmfile [eqiad] START helmfile.d/services/edit-analytics: apply
16:24 stevemunene@deploy2002: helmfile [codfw] DONE helmfile.d/services/edit-analytics: apply
16:24 stevemunene@deploy2002: helmfile [codfw] START helmfile.d/services/edit-analytics: apply
16:23 stevemunene@deploy2002: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
16:22 stevemunene@deploy2002: helmfile [staging] START helmfile.d/services/edit-analytics: apply
16:16 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo and not P{cp4050.ulsfo.wmnet,cp4051.ulsfo.wmnet} and A:cp
16:11 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on P{cp3066.esams.wmnet} and A:cp
16:10 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on P{cp3066.esams.wmnet} and A:cp
15:39 moritzm: rebuild md RAIDs after disk swap T353324
14:55 TheresNoTime: UTC afternoon backport window done
14:54 samtar@deploy2002: Finished scap: Backport for zhwikinews: update wordmark (T353792) (duration: 09m 11s)
14:48 samtar@deploy2002: anzx and samtar: Continuing with sync
14:46 samtar@deploy2002: anzx and samtar: Backport for zhwikinews: update wordmark (T353792) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:45 samtar@deploy2002: Started scap: Backport for zhwikinews: update wordmark (T353792)
14:43 samtar@deploy2002: Finished scap: Backport for aswikiquote: change wordmark and update logo (T353934) (duration: 07m 51s)
14:38 samtar@deploy2002: samtar and anzx: Continuing with sync
14:37 samtar@deploy2002: samtar and anzx: Backport for aswikiquote: change wordmark and update logo (T353934) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:36 samtar@deploy2002: Started scap: Backport for aswikiquote: change wordmark and update logo (T353934)
14:34 samtar@deploy2002: Finished scap: Backport for Edit Recovery: fix typo in expiry field name (T347673) (duration: 07m 46s)
14:29 samtar@deploy2002: samtar: Continuing with sync
14:28 samtar@deploy2002: samtar: Backport for Edit Recovery: fix typo in expiry field name (T347673) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:27 samtar@deploy2002: Started scap: Backport for Edit Recovery: fix typo in expiry field name (T347673)
14:18 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
14:18 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
14:17 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
14:17 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
14:11 samtar@deploy2002: Finished scap: Backport for zhwikivoyage: Enable block feature for abusefilter (T353604), ganwiki: Add transwiki import sources (T354000) (duration: 09m 58s)
14:06 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
14:06 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
14:06 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
14:05 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
14:05 samtar@deploy2002: samtar and stang: Continuing with sync
14:03 moritzm: installing qemu security updates
14:02 samtar@deploy2002: samtar and stang: Backport for zhwikivoyage: Enable block feature for abusefilter (T353604), ganwiki: Add transwiki import sources (T354000) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:01 samtar@deploy2002: Started scap: Backport for zhwikivoyage: Enable block feature for abusefilter (T353604), ganwiki: Add transwiki import sources (T354000)
13:32 root@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Nick Ifeajika out of all services on: 2220 hosts
13:31 root@cumin2002: START - Cookbook sre.idm.logout Logging Nick Ifeajika out of all services on: 2220 hosts
13:29 moritzm: installing Java 8/11 security updates
12:34 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
12:34 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
12:29 jmm@cumin2002: END (PASS) - Cookbook sre.maps.roll-restart-reboot-master (exit_code=0) rolling restart_daemons on A:maps-master
12:28 jmm@cumin2002: START - Cookbook sre.maps.roll-restart-reboot-master rolling restart_daemons on A:maps-master
12:23 jmm@cumin2002: END (PASS) - Cookbook sre.maps.roll-restart-reboot (exit_code=0) rolling restart_daemons on A:maps-replica-eqiad
12:18 jmm@cumin2002: START - Cookbook sre.maps.roll-restart-reboot rolling restart_daemons on A:maps-replica-eqiad
12:14 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
12:14 jmm@cumin2002: END (PASS) - Cookbook sre.maps.roll-restart-reboot (exit_code=0) rolling restart_daemons on A:maps-replica-codfw
12:13 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
12:08 jmm@cumin2002: START - Cookbook sre.maps.roll-restart-reboot rolling restart_daemons on A:maps-replica-codfw
12:02 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
12:02 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
12:01 moritzm: installing gnutls28 security updates on buster
11:47 oblivian@deploy2002: Finished scap: Backport for Fix timeouts detection on mw on k8s jobrunners (T354229) (duration: 11m 38s)
11:44 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
11:44 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
11:41 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
11:41 oblivian@deploy2002: oblivian: Continuing with sync
11:40 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
11:39 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
11:37 oblivian@deploy2002: oblivian: Backport for Fix timeouts detection on mw on k8s jobrunners (T354229) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
11:36 oblivian@deploy2002: Started scap: Backport for Fix timeouts detection on mw on k8s jobrunners (T354229)
11:31 oblivian@deploy2002: Finished scap: Backport for Disable things that don't work on k8s when on k8s (duration: 15m 29s)
11:25 oblivian@deploy2002: oblivian: Continuing with sync
11:25 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
11:24 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
11:24 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
11:24 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
11:24 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
11:23 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
11:23 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
11:23 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
11:23 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
11:22 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
11:22 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
11:21 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
11:21 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
11:20 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
11:20 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
11:18 oblivian@deploy2002: oblivian: Backport for Disable things that don't work on k8s when on k8s synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
11:16 oblivian@deploy2002: Started scap: Backport for Disable things that don't work on k8s when on k8s
11:05 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
10:56 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
10:53 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
10:51 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
10:51 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
10:48 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
10:48 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
10:46 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
10:46 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
10:35 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
10:35 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
10:23 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
10:16 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
10:15 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
10:13 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
10:11 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
10:11 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
10:10 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
10:09 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
10:08 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
09:57 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
09:40 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
09:39 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
09:36 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
09:36 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
09:35 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
09:35 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
09:33 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
09:33 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
09:32 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
09:32 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
09:31 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
09:31 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
09:21 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
09:21 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
09:21 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
09:21 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
09:21 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
09:13 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
09:10 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
09:10 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
09:08 pfischer@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
09:07 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
09:03 pfischer@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
01:16 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
01:16 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
01:13 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
01:13 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
00:55 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
00:55 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
00:08 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
00:08 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.

2024-01-02

22:42 urbanecm: mwmaint2002: Restart `mwscript extensions/GrowthExperiments/maintenance/reassignMentees.php --wiki=enwiki --mentor 'FormalDude' --performer 'Martin Urbanec (WMF)'` (T354220)
22:29 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2087.codfw.wmnet with OS bullseye
21:08 bking@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2087.codfw.wmnet with OS bullseye
20:52 urbanecm: mwmaint2002: `mwscript extensions/GrowthExperiments/maintenance/reassignMentees.php --wiki=enwiki --mentor 'FormalDude' --performer 'Martin Urbanec (WMF)'` (T354220)
20:32 mutante: phab2002 - synced /srv/homes from phab1004 to /srv/homes on phab2002
19:39 dduvall@deploy2002: rebuilt and synchronized wikiversions files: group0 wikis to 1.42.0-wmf.12 refs T350088
18:29 mutante: confctl select 'name=mw2394.codfw.wmnet' set/pooled=inactive | T354193#9430654 - seems like 2396 was previously depooled instead of this 2394
17:29 dancy@deploy2002: Installation of scap version "4.65.1" completed for 566 hosts
17:28 dancy@deploy2002: Installing scap version "4.65.1" for 566 hosts
17:26 dancy@deploy2002: Installing scap version "4.65.1" for 567 hosts
14:59 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dbstore1008.eqiad.wmnet with OS bookworm
14:58 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dbstore1009.eqiad.wmnet with OS bookworm
14:44 urbanecm: [urbanecm@mwmaint2002 ~]$ mwscript namespaceDupes.php --wiki=csbwiktionary --fix # T354114
14:43 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dbstore1009.eqiad.wmnet with reason: host reimage
14:40 btullis@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on dbstore1009.eqiad.wmnet with reason: host reimage
14:37 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dbstore1008.eqiad.wmnet with reason: host reimage
14:34 btullis@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on dbstore1008.eqiad.wmnet with reason: host reimage
14:32 _joe_: confctl select 'name=mw2396.codfw.wmnet' set/pooled=inactive
14:26 btullis@cumin1001: START - Cookbook sre.hosts.reimage for host dbstore1009.eqiad.wmnet with OS bookworm
14:20 btullis@cumin1001: START - Cookbook sre.hosts.reimage for host dbstore1008.eqiad.wmnet with OS bookworm
14:16 urbanecm@deploy2002: Finished scap: Backport for cswiki: Grant patrolmarks to autopatrolled (T354004), csbwiktionary: Set MetaNamespaceName to Wikisłowôrz (T354114) (duration: 13m 46s)
14:04 urbanecm@deploy2002: urbanecm: Continuing with sync
14:04 urbanecm@deploy2002: urbanecm: Backport for cswiki: Grant patrolmarks to autopatrolled (T354004), csbwiktionary: Set MetaNamespaceName to Wikisłowôrz (T354114) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
14:02 urbanecm@deploy2002: Started scap: Backport for cswiki: Grant patrolmarks to autopatrolled (T354004), csbwiktionary: Set MetaNamespaceName to Wikisłowôrz (T354114)
10:55 vgutierrez@cumin1002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on P{cp4044.ulsfo.wmnet,cp4050.ulsfo.wmnet} and A:cp
10:50 vgutierrez@cumin1002: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on P{cp4044.ulsfo.wmnet,cp4050.ulsfo.wmnet} and A:cp
10:38 vgutierrez: fetching haproxy 2.6.16 for thirdparty/haproxy26 bullseye-wikimedia (apt.wm.o)
09:23 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on dbstore1008.eqiad.wmnet with reason: Commissioning new database server
09:23 btullis@cumin1001: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on dbstore1008.eqiad.wmnet with reason: Commissioning new database server
09:17 pfischer@deploy2002: Finished scap: Backport for configure message_key_fields for update_pipeline (duration: 15m 35s)
09:05 pfischer@deploy2002: pfischer: Continuing with sync
09:04 pfischer@deploy2002: pfischer: Backport for configure message_key_fields for update_pipeline synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
09:02 moritzm: installing nodejs security updates on bookworm
09:02 pfischer@deploy2002: Started scap: Backport for configure message_key_fields for update_pipeline
08:33 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mw2448.mgmt.codfw.wmnet with reboot policy GRACEFUL
08:27 jayme: restart prometheus@k8s prometheus@k8s-aux in eqiad - T343529
08:26 akosiaris@cumin1001: START - Cookbook sre.hosts.provision for host mw2448.mgmt.codfw.wmnet with reboot policy GRACEFUL
06:45 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2144.codfw.wmnet with OS bookworm
06:27 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2144.codfw.wmnet with reason: host reimage
06:24 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2144.codfw.wmnet with reason: host reimage
06:06 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2144.codfw.wmnet with OS bookworm
05:00 mwpresync@deploy2002: Finished scap: testwikis wikis to 1.42.0-wmf.12 refs T350088 (duration: 56m 48s)
04:03 mwpresync@deploy2002: Started scap: testwikis wikis to 1.42.0-wmf.12 refs T350088

2024-01-01

21:38 eileen: config revision changed from 026cf508 to 21b91455
21:13 eileen: config revision changed from 3a1a1444 to 026cf508
21:13 eileen: fork/mapping-edit-button-fix
17:11 joal@deploy2002: Finished deploy [airflow-dags/analytics@8b8a456]: Fix monthly job [airflow-dags/analytics@8b8a4567] (duration: 00m 31s)
17:11 joal@deploy2002: Started deploy [airflow-dags/analytics@8b8a456]: Fix monthly job [airflow-dags/analytics@8b8a4567]

Other archives

2000s

Archive 1: 2004 Jun - 2004 Sep
Archive 2: 2004 Oct - 2004 Nov
Archive 3: 2004 Dec - 2005 Mar
Archive 4: 2005 Apr - 2005 Jul
Archive 5: 2005 Aug - 2005 Oct, with revision history 2004-06-23 to 2005-11-25
Archive 6: 2005 Nov - 2006 Feb
Archive 7: 2006 Mar - 2006 Jun
Archive 8: 2006 Jul - 2006 Sep
Archive 9: 2006 Oct - 2007 Jan, with revision history 2005-11-25 to 2007-02-21
Archive 10: 2007 Feb - 2007 Jun
Archive 11: 2007 Jul - 2007 Dec
Archive 12: 2008 Jan - 2008 Jul
Archive 12a: 2008 Aug
Archive 12b: 2008 Sep
Archive 13: 2008 Oct - 2009 Jun
Archive 14: 2009 Jun - 2009 Dec

2010s

2020s