Server Admin Log

From Wikitech

2026-03-12

  • 02:16 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-ctrl2005.codfw.wmnet with OS trixie
  • 02:09 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 14s)
  • 02:03 swfrench-wmf: reprepro include php-igbinary_3.2.16-4+icu72+wmf11u1 into component/php83-icu72 - T419058
  • 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
  • 01:59 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-ctrl2005.codfw.wmnet with reason: host reimage
  • 01:52 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-ctrl2005.codfw.wmnet with reason: host reimage
  • 01:49 swfrench-wmf: reprepro include php-msgpack_3.0.0-1+icu72+wmf11u1 into component/php83-icu72 - T419058
  • 01:47 swfrench-wmf: reprepro include php-apcu_5.1.24-1+icu72+wmf11u1 into component/php83-icu72 - T419058
  • 01:37 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2005.codfw.wmnet with OS trixie
  • 01:36 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-ctrl2004.codfw.wmnet with OS trixie
  • 01:24 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7012.*
  • 01:20 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
  • 01:18 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-ctrl2004.codfw.wmnet with reason: host reimage
  • 01:18 rzl@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
  • 01:15 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-ctrl2004.codfw.wmnet with reason: host reimage
  • 01:13 swfrench-wmf: reprepro include dh-php_5.5+icu72+wmf11u1 into component/php83-icu72 - T419058
  • 01:08 swfrench-wmf: reprepro include php-defaults_94+icu72+wmf11u1 into component/php83-icu72 - T419058
  • 01:05 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
  • 01:05 rzl@deploy2002: helmfile [staging] START helmfile.d/services/api-gateway: apply
  • 01:03 swfrench-wmf: reprepro include php8.3_8.3.30-1+icu72+wmf11u1 into component/php83-icu72 - T419058
  • 01:00 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2004.codfw.wmnet with OS trixie
  • 00:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7012.magru.wmnet with OS trixie
  • 00:59 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - brett@cumin2002"
  • 00:58 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - brett@cumin2002"
  • 00:38 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: sync
  • 00:38 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: sync
  • 00:37 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: sync
  • 00:37 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: sync
  • 00:36 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: sync
  • 00:36 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: sync
  • 00:33 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: sync
  • 00:29 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7012.magru.wmnet with reason: host reimage
  • 00:27 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: sync
  • 00:24 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7012.magru.wmnet with reason: host reimage
  • 00:03 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7012.magru.wmnet with OS trixie

2026-03-11

  • 23:56 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7009.*
  • 22:52 cjming@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
  • 22:52 cjming@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
  • 22:45 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 22:44 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 22:29 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 22:29 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 22:27 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7009.magru.wmnet with OS trixie
  • 21:56 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: sync
  • 21:55 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: sync
  • 21:54 jforrester@deploy2002: Finished scap sync-world: Backport for OrchestratorRequest: Switch evaluations to v2 endpoint (T413727) (duration: 18m 19s)
  • 21:47 jforrester@deploy2002: jforrester: Continuing with sync
  • 21:43 cjming@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
  • 21:43 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7009.magru.wmnet with reason: host reimage
  • 21:42 cjming@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
  • 21:40 jforrester@deploy2002: jforrester: Backport for OrchestratorRequest: Switch evaluations to v2 endpoint (T413727) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 21:39 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7009.magru.wmnet with reason: host reimage
  • 21:35 jforrester@deploy2002: Started scap sync-world: Backport for OrchestratorRequest: Switch evaluations to v2 endpoint (T413727)
  • 21:30 rzl: rzl@apt1002:~$ sudo -i reprepro -C component/envoy-future include bullseye-wikimedia /home/rzl/envoyproxy_1.35.9-1_amd64.changes
  • 21:29 arlolra@deploy2002: Finished scap sync-world: Backport for Show category index when no category selected on Special:LintTemplateErrors (T417363), Show category index when no category selected on Special:LintTemplateErrors (T417363) (duration: 35m 16s)
  • 21:16 arlolra@deploy2002: arlolra: Continuing with sync
  • 21:15 arlolra@deploy2002: arlolra: Backport for Show category index when no category selected on Special:LintTemplateErrors (T417363), Show category index when no category selected on Special:LintTemplateErrors (T417363) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 21:08 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7009.magru.wmnet with OS trixie
  • 21:08 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7010.*
  • 21:04 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7010.magru.wmnet with OS trixie
  • 20:54 arlolra@deploy2002: Started scap sync-world: Backport for Show category index when no category selected on Special:LintTemplateErrors (T417363), Show category index when no category selected on Special:LintTemplateErrors (T417363)
  • 20:47 jsn@deploy2002: Finished scap sync-world: Backport for urwikisource: add logo, sitename and projectnamespace (T415974) (duration: 06m 55s)
  • 20:43 jsn@deploy2002: anzx, jsn: Continuing with sync
  • 20:42 jsn@deploy2002: anzx, jsn: Backport for urwikisource: add logo, sitename and projectnamespace (T415974) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 20:40 jsn@deploy2002: Started scap sync-world: Backport for urwikisource: add logo, sitename and projectnamespace (T415974)
  • 20:38 jsn@deploy2002: Finished scap sync-world: Backport for riskyArticleEdits: show page descriptions (T419442), Fix Instrumentation on mobile view (T419517), ext.wikimediaEvents: Updated Test Kitchen impact test experiment (T407570) (duration: 10m 37s)
  • 20:38 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on ml-serve1014.eqiad.wmnet with reason: T400626
  • 20:37 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7010.magru.wmnet with reason: host reimage
  • 20:34 jsn@deploy2002: jsn, sfaci: Continuing with sync
  • 20:34 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search-test: apply
  • 20:33 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search-test: apply
  • 20:32 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7010.magru.wmnet with reason: host reimage
  • 20:30 jsn@deploy2002: jsn, sfaci: Backport for riskyArticleEdits: show page descriptions (T419442), Fix Instrumentation on mobile view (T419517), ext.wikimediaEvents: Updated Test Kitchen impact test experiment (T407570) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 20:28 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on gitlab1003.wikimedia.org with reason: Upgrade
  • 20:28 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on gitlab2002.wikimedia.org with reason: Upgrade
  • 20:27 jsn@deploy2002: Started scap sync-world: Backport for riskyArticleEdits: show page descriptions (T419442), Fix Instrumentation on mobile view (T419517), ext.wikimediaEvents: Updated Test Kitchen impact test experiment (T407570)
  • 20:21 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 20:18 andrew@cumin2002: START - Cookbook sre.dns.netbox
  • 20:17 bvibber@deploy2002: Finished scap sync-world: Backport for Revert "Fix for temp section open during slow loads on Parsoid" (T416063 T419170 T419721), Revert "Fix for temp section open during slow loads on Parsoid" (T416063 T419170 T419721) (duration: 06m 47s)
  • 20:13 bvibber@deploy2002: bvibber: Continuing with sync
  • 20:12 bvibber@deploy2002: bvibber: Backport for Revert "Fix for temp section open during slow loads on Parsoid" (T416063 T419170 T419721), Revert "Fix for temp section open during slow loads on Parsoid" (T416063 T419170 T419721) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 20:10 bvibber@deploy2002: Started scap sync-world: Backport for Revert "Fix for temp section open during slow loads on Parsoid" (T416063 T419170 T419721), Revert "Fix for temp section open during slow loads on Parsoid" (T416063 T419170 T419721)
  • 19:59 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7010.magru.wmnet with OS trixie
  • 19:54 andrew@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 19:51 andrew@cumin2002: START - Cookbook sre.dns.netbox
  • 19:37 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-backup1004.eqiad.wmnet with OS trixie
  • 19:01 robh@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts cp7011.magru.wmnet
  • 19:01 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp7011.magru.wmnet
  • 18:56 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns (exit_code=0) rolling restart_daemons on A:wikidough
  • 18:49 brennen@deploy2002: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.19 refs T413810
  • 18:49 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
  • 18:49 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
  • 18:45 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
  • 18:45 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
  • 18:44 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
  • 18:44 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
  • 18:43 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns rolling restart_daemons on A:wikidough
  • 18:42 brennen: 1.46.0-wmf.19 train status: no current blockers, going ahead to group1.
  • 18:39 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2332.codfw.wmnet
  • 18:37 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2332.codfw.wmnet
  • 18:20 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7011.*
  • 18:18 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
  • 18:16 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host ms-backup1004.eqiad.wmnet with OS trixie
  • 18:13 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
  • 17:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1011.eqiad.wmnet with reason: host reimage
  • 17:55 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1010.eqiad.wmnet with reason: host reimage
  • 17:52 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
  • 17:52 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
  • 17:48 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1011.eqiad.wmnet with reason: host reimage
  • 17:47 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1010.eqiad.wmnet with reason: host reimage
  • 17:46 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1011.eqiad.wmnet with OS bookworm
  • 17:45 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
  • 17:42 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
  • 17:42 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
  • 17:38 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
  • 17:38 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
  • 17:38 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
  • 17:37 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
  • 17:37 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
  • 17:36 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
  • 17:36 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
  • 17:36 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
  • 17:35 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
  • 17:34 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1011.eqiad.wmnet with OS bookworm
  • 17:32 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: sync
  • 17:31 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: sync
  • 17:29 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
  • 17:28 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
  • 17:28 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
  • 17:27 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
  • 17:20 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
  • 17:19 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
  • 17:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1011.eqiad.wmnet with reason: host reimage
  • 17:19 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
  • 17:18 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
  • 17:15 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1011.eqiad.wmnet with reason: host reimage
  • 17:13 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
  • 17:12 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
  • 17:09 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
  • 17:09 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
  • 17:02 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7011.magru.wmnet with OS trixie
  • 17:01 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1011.eqiad.wmnet with OS bookworm
  • 17:00 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1011.eqiad.wmnet with OS bookworm
  • 16:58 jmm@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on durum4004.ulsfo.wmnet with reason: in setup
  • 16:58 jmm@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on durum4003.ulsfo.wmnet with reason: in setup
  • 16:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1011.eqiad.wmnet with reason: host reimage
  • 16:40 root@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:40 root@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: moving many things from cloudgw2002-dev to cloudgw2004-dev - root@cumin2002"
  • 16:40 root@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: moving many things from cloudgw2002-dev to cloudgw2004-dev - root@cumin2002"
  • 16:39 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1011.eqiad.wmnet with reason: host reimage
  • 16:37 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
  • 16:36 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1010.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 16:35 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7011.magru.wmnet with reason: host reimage
  • 16:35 root@cumin2002: START - Cookbook sre.dns.netbox
  • 16:32 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts prometheus4002.ulsfo.wmnet
  • 16:32 tappof@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:32 tappof@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: prometheus4002.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - tappof@cumin1003"
  • 16:30 tappof@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: prometheus4002.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - tappof@cumin1003"
  • 16:30 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7011.magru.wmnet with reason: host reimage
  • 16:25 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1011.eqiad.wmnet with OS bookworm
  • 16:23 tappof@cumin1003: START - Cookbook sre.dns.netbox
  • 16:18 tappof@cumin1003: START - Cookbook sre.hosts.decommission for hosts prometheus4002.ulsfo.wmnet
  • 15:57 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 15:52 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7011.magru.wmnet with OS trixie
  • 15:51 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Security Release - T419712
  • 15:51 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
  • 15:50 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
  • 15:50 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 15:49 urbanecm@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
  • 15:48 sukhe: sudo cumin -b1 -s10 "C:dnsrecursor" "run-puppet-agent --enable 'merging CR 1250576'"
  • 15:48 urbanecm@deploy2002: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
  • 15:46 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1010.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 15:45 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
  • 15:43 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Security Release - T419712
  • 15:39 sukhe: sudo cumin "C:dnsrecursor" "disable-puppet 'merging CR 1250576'"
  • 15:35 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Security Release - T419712
  • 15:26 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Security Release - T419712
  • 15:08 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
  • 15:08 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
  • 15:00 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 14:53 swfrench-wmf: updated component/php83-icu72 with libpcre2 10.42-1~wmf11+1 from apt-staging - T419058
  • 14:46 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 14:45 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
  • 14:45 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
  • 14:41 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host durum4004.ulsfo.wmnet
  • 14:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host durum4004.ulsfo.wmnet with OS trixie
  • 14:39 vgutierrez: depool ncredir4003 && ncredir4004
  • 14:38 vgutierrez: repool ncredir4001 && ncredir4002
  • 14:31 jmm@puppetserver1001: conftool action : set/pooled=no; selector: name=ncredir4002.ulsfo.wmnet
  • 14:31 jmm@puppetserver1001: conftool action : set/pooled=no; selector: name=ncredir4001.ulsfo.wmnet
  • 14:30 jmm@puppetserver1001: conftool action : set/pooled=yes; selector: name=ncredir4004.ulsfo.wmnet
  • 14:30 jmm@puppetserver1001: conftool action : set/weight=1; selector: name=ncredir4004.ulsfo.wmnet
  • 14:27 jmm@puppetserver1001: conftool action : set/pooled=yes; selector: name=ncredir4003.ulsfo.wmnet
  • 14:27 jmm@puppetserver1001: conftool action : set/weight=1; selector: name=ncredir4003.ulsfo.wmnet
  • 14:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on durum4004.ulsfo.wmnet with reason: host reimage
  • 14:23 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
  • 14:23 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
  • 14:19 moritzm: installing python-urllib3 security updates
  • 14:18 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on durum4004.ulsfo.wmnet with reason: host reimage
  • 14:14 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
  • 14:13 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 14:12 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
  • 14:12 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 14:12 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
  • 14:12 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 14:11 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 14:11 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
  • 14:11 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
  • 14:11 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 14:10 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 14:08 gkyziridis@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main'.
  • 14:08 gkyziridis@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'edit-check' for release 'main'.
  • 14:07 jdlrobson@deploy2002: Finished scap sync-world: Backport for Fix pinnableElement export (T419620) (duration: 06m 26s)
  • 14:06 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 14:05 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 14:04 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 14:04 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 14:03 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 14:03 jdlrobson@deploy2002: jdlrobson: Continuing with sync
  • 14:03 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 14:02 jdlrobson@deploy2002: jdlrobson: Backport for Fix pinnableElement export (T419620) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:00 jdlrobson@deploy2002: Started scap sync-world: Backport for Fix pinnableElement export (T419620)
  • 13:58 moritzm: uploaded libxml2 2.9.10+dfsg-6.7+deb11u9+wmf11u1 to component/php83-icu72 for bullseye-wikimedia (special build of libxml with ICU disabled to ensure co-installability between icu 67 and icu 72) T419058
  • 13:57 jdlrobson@deploy2002: Finished scap sync-world: Backport for Restore advanced main menu for AMC (T413912) (duration: 10m 44s)
  • 13:55 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host durum4004.ulsfo.wmnet with OS trixie
  • 13:54 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1010.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 13:54 vgutierrez: repool cp7016
  • 13:54 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum4004.ulsfo.wmnet - jmm@cumin2002"
  • 13:54 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum4004.ulsfo.wmnet - jmm@cumin2002"
  • 13:54 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) durum4004.ulsfo.wmnet on all recursors
  • 13:54 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache durum4004.ulsfo.wmnet on all recursors
  • 13:54 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:54 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum4004.ulsfo.wmnet - jmm@cumin2002"
  • 13:51 jdlrobson@deploy2002: jdlrobson: Continuing with sync
  • 13:50 jdlrobson@deploy2002: jdlrobson: Backport for Restore advanced main menu for AMC (T413912) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 13:49 vgutierrez: depool cp7016
  • 13:49 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum4004.ulsfo.wmnet - jmm@cumin2002"
  • 13:46 jdlrobson@deploy2002: Started scap sync-world: Backport for Restore advanced main menu for AMC (T413912)
  • 13:45 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1010.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 13:44 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1010.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 13:44 jdlrobson@deploy2002: Finished scap sync-world: Backport for Remove `MetricsPlatform` configuration from production (T416865) (duration: 35m 52s)
  • 13:43 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1010.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 13:42 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
  • 13:42 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1011.eqiad.wmnet with OS bookworm
  • 13:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ncredir4004.ulsfo.wmnet with OS bookworm
  • 13:36 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 13:36 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host durum4004.ulsfo.wmnet
  • 13:36 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host durum4003.ulsfo.wmnet
  • 13:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host durum4003.ulsfo.wmnet with OS trixie
  • 13:36 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
  • 13:35 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
  • 13:30 jdlrobson@deploy2002: jdlrobson, sfaci: Continuing with sync
  • 13:29 jdlrobson@deploy2002: jdlrobson, sfaci: Backport for Remove `MetricsPlatform` configuration from production (T416865) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 13:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ncredir4004.ulsfo.wmnet with reason: host reimage
  • 13:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on durum4003.ulsfo.wmnet with reason: host reimage
  • 13:18 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ncredir4004.ulsfo.wmnet with reason: host reimage
  • 13:13 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on durum4003.ulsfo.wmnet with reason: host reimage
  • 13:08 jdlrobson@deploy2002: Started scap sync-world: Backport for Remove `MetricsPlatform` configuration from production (T416865)
  • 13:00 moritzm: installing libcommons-lang3-java security updates
  • 12:57 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ncredir4004.ulsfo.wmnet with OS bookworm
  • 12:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ncredir4003.ulsfo.wmnet with OS bookworm
  • 12:46 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host durum4003.ulsfo.wmnet with OS trixie
  • 12:46 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum4003.ulsfo.wmnet - jmm@cumin2002"
  • 12:46 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum4003.ulsfo.wmnet - jmm@cumin2002"
  • 12:46 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) durum4003.ulsfo.wmnet on all recursors
  • 12:45 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache durum4003.ulsfo.wmnet on all recursors
  • 12:45 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:45 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum4003.ulsfo.wmnet - jmm@cumin2002"
  • 12:41 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum4003.ulsfo.wmnet - jmm@cumin2002"
  • 12:37 moritzm: installing inetutils security updates
  • 12:36 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 12:36 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host durum4003.ulsfo.wmnet
  • 12:35 tappof: completed migration from prometheus4002 to prometheus4003 (ulsfo) (T419430)
  • 12:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ncredir4003.ulsfo.wmnet with reason: host reimage
  • 12:28 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ncredir4003.ulsfo.wmnet with reason: host reimage
  • 12:28 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
  • 12:24 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2073.codfw.wmnet with OS bullseye
  • 12:23 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
  • 12:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow4003.ulsfo.wmnet
  • 12:18 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1011.eqiad.wmnet with OS bookworm
  • 12:17 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1011
  • 12:17 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1011
  • 12:14 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2072.codfw.wmnet with OS bullseye
  • 12:11 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow4003.ulsfo.wmnet
  • 12:05 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2073.codfw.wmnet with reason: host reimage
  • 12:04 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ncredir4003.ulsfo.wmnet with OS bookworm
  • 12:01 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1010
  • 11:59 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1010
  • 11:58 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2073.codfw.wmnet with reason: host reimage
  • 11:54 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2072.codfw.wmnet with reason: host reimage
  • 11:48 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2072.codfw.wmnet with reason: host reimage
  • 11:41 urbanecm@deploy2002: Finished scap sync-world: Backport for [Growth] Enable on every new Wikipedia by default (T304052) (duration: 06m 39s)
  • 11:38 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2073
  • 11:38 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2073
  • 11:37 vgutierrez: upgrading to acme-chief 0.39 on acme-chief production instances - T419352
  • 11:37 urbanecm@deploy2002: urbanecm: Continuing with sync
  • 11:36 urbanecm@deploy2002: urbanecm: Backport for [Growth] Enable on every new Wikipedia by default (T304052) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 11:36 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2073
  • 11:36 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2073.codfw.wmnet 212.48.192.10.in-addr.arpa 2.1.2.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 11:36 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2073.codfw.wmnet 212.48.192.10.in-addr.arpa 2.1.2.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 11:36 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:36 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2073 - mvernon@cumin2002"
  • 11:36 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2073 - mvernon@cumin2002"
  • 11:35 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
  • 11:34 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/api-gateway: apply
  • 11:34 urbanecm@deploy2002: Started scap sync-world: Backport for [Growth] Enable on every new Wikipedia by default (T304052)
  • 11:34 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
  • 11:34 urbanecm@deploy2002: Finished scap sync-world: Backport for [Growth] kaiwiki: Enable GrowthExperiments (T304052) (duration: 14m 11s)
  • 11:33 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
  • 11:33 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
  • 11:32 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/api-gateway: apply
  • 11:32 mvernon@cumin2002: START - Cookbook sre.dns.netbox
  • 11:31 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2073
  • 11:30 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2073.codfw.wmnet with OS bullseye
  • 11:30 urbanecm@deploy2002: urbanecm: Continuing with sync
  • 11:29 cgoubert@dns1004: END - running authdns-update
  • 11:29 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2072
  • 11:29 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2072
  • 11:28 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2072
  • 11:28 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2072.codfw.wmnet 158.32.192.10.in-addr.arpa 8.5.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 11:28 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2072.codfw.wmnet 158.32.192.10.in-addr.arpa 8.5.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 11:28 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:28 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2072 - mvernon@cumin2002"
  • 11:28 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2072 - mvernon@cumin2002"
  • 11:28 cgoubert@dns1004: START - running authdns-update
  • 11:26 urbanecm@deploy2002: mwscript-k8s job started: WikimediaMaintenance:createExtensionTables.php --wiki=kaiwiki growthexperiments # T304052
  • 11:24 mvernon@cumin2002: START - Cookbook sre.dns.netbox
  • 11:24 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2072
  • 11:23 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2072.codfw.wmnet with OS bullseye
  • 11:22 tappof@dns1004: END - running authdns-update
  • 11:22 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/zotero: apply
  • 11:21 urbanecm@deploy2002: urbanecm: Backport for [Growth] kaiwiki: Enable GrowthExperiments (T304052) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 11:21 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/zotero: apply
  • 11:21 mvolz@deploy2002: helmfile [codfw] DONE helmfile.d/services/zotero: apply
  • 11:21 tappof@dns1004: START - running authdns-update
  • 11:21 mvolz@deploy2002: helmfile [codfw] START helmfile.d/services/zotero: apply
  • 11:19 urbanecm@deploy2002: Started scap sync-world: Backport for [Growth] kaiwiki: Enable GrowthExperiments (T304052)
  • 11:19 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
  • 11:19 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2071.codfw.wmnet with OS bullseye
  • 11:18 urbanecm@deploy2002: mwscript-k8s job started: WikimediaMaintenance:createExtensionTables.php --wiki=kaiwiki growthexperiments # T304052
  • 11:10 mvolz@deploy2002: helmfile [codfw] DONE helmfile.d/services/zotero: apply
  • 11:10 mvolz@deploy2002: helmfile [codfw] START helmfile.d/services/zotero: apply
  • 11:08 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/zotero: apply
  • 11:08 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/zotero: apply
  • 11:05 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/zotero: apply
  • 11:05 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/zotero: apply
  • 10:58 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2071.codfw.wmnet with reason: host reimage
  • 10:54 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2071.codfw.wmnet with reason: host reimage
  • 10:35 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2071
  • 10:35 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2071
  • 10:34 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2071
  • 10:34 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2071.codfw.wmnet 221.16.192.10.in-addr.arpa 1.2.2.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 10:34 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2071.codfw.wmnet 221.16.192.10.in-addr.arpa 1.2.2.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 10:34 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:34 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2071 - mvernon@cumin2002"
  • 10:34 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2071 - mvernon@cumin2002"
  • 10:26 mvernon@cumin2002: START - Cookbook sre.dns.netbox
  • 10:24 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2071
  • 10:23 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2071.codfw.wmnet with OS bullseye
  • 10:08 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2095.codfw.wmnet with OS bullseye
  • 10:03 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Failed step after ml-serve1015's reimage - elukey@cumin1003"
  • 10:02 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Failed step after ml-serve1015's reimage - elukey@cumin1003"
  • 10:01 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve1015.eqiad.wmnet with OS trixie
  • 10:01 elukey@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
  • 09:59 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2096.codfw.wmnet with OS bullseye
  • 09:52 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2096.codfw.wmnet with OS bullseye
  • 09:52 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - mvernon@cumin2002"
  • 09:51 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - mvernon@cumin2002"
  • 09:51 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2095.codfw.wmnet with OS bullseye
  • 09:51 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - mvernon@cumin2002"
  • 09:46 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - mvernon@cumin2002"
  • 09:37 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wmde: apply
  • 09:37 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wmde: apply
  • 09:37 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
  • 09:36 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
  • 09:35 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-sre: apply
  • 09:35 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-sre: apply
  • 09:35 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-search: apply
  • 09:34 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-search: apply
  • 09:33 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-research: apply
  • 09:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-research: apply
  • 09:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-platform-eng: apply
  • 09:32 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2096.codfw.wmnet with reason: host reimage
  • 09:31 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-platform-eng: apply
  • 09:30 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-ml: apply
  • 09:30 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-ml: apply
  • 09:29 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
  • 09:28 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
  • 09:28 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2096.codfw.wmnet with reason: host reimage
  • 09:28 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 09:27 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2095.codfw.wmnet with reason: host reimage
  • 09:27 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 09:26 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-product: apply
  • 09:25 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-product: apply
  • 09:24 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2095.codfw.wmnet with reason: host reimage
  • 09:22 gkyziridis@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
  • 09:19 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host ncredir4004.ulsfo.wmnet
  • 09:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ncredir4004.ulsfo.wmnet with OS bookworm
  • 09:15 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
  • 09:15 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
  • 09:14 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
  • 09:10 javiermonton@deploy2002: Finished scap sync-world: Backport for stream: mediawiki.page_html_content_change (T419258) (duration: 08m 28s)
  • 09:07 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2096.codfw.wmnet with OS bullseye
  • 09:06 javiermonton@deploy2002: javiermonton: Continuing with sync
  • 09:03 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2095.codfw.wmnet with OS bullseye
  • 09:03 javiermonton@deploy2002: javiermonton: Backport for stream: mediawiki.page_html_content_change (T419258) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 09:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ncredir4004.ulsfo.wmnet with reason: host reimage
  • 09:01 javiermonton@deploy2002: Started scap sync-world: Backport for stream: mediawiki.page_html_content_change (T419258)
  • 08:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-serve1015.eqiad.wmnet with reason: host reimage
  • 08:58 trueg@deploy2002: helmfile [staging] DONE helmfile.d/services/SERVICE_NAME: apply
  • 08:58 trueg@deploy2002: helmfile [staging] START helmfile.d/services/SERVICE_NAME: apply
  • 08:58 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ncredir4004.ulsfo.wmnet with reason: host reimage
  • 08:55 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2239.codfw.wmnet with reason: mysql upgrade / restart
  • 08:54 moritzm: installing imagemagick security updates
  • 08:52 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve1015.eqiad.wmnet with reason: host reimage
  • 08:41 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host ml-serve1015.eqiad.wmnet with OS trixie
  • 08:40 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve1014.eqiad.wmnet with OS trixie
  • 08:40 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
  • 08:39 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
  • 08:35 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ncredir4004.ulsfo.wmnet with OS bookworm
  • 08:32 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM ncredir4004.ulsfo.wmnet - jmm@cumin2002"
  • 08:32 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM ncredir4004.ulsfo.wmnet - jmm@cumin2002"
  • 08:31 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ncredir4004.ulsfo.wmnet on all recursors
  • 08:31 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache ncredir4004.ulsfo.wmnet on all recursors
  • 08:31 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:31 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM ncredir4004.ulsfo.wmnet - jmm@cumin2002"
  • 08:25 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM ncredir4004.ulsfo.wmnet - jmm@cumin2002"
  • 08:23 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-serve1014.eqiad.wmnet with reason: host reimage
  • 08:22 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 08:21 Msz2001: UTC morning backport window finished
  • 08:21 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 08:21 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 08:21 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host ncredir4004.ulsfo.wmnet
  • 08:21 mszwarc@deploy2002: Finished scap sync-world: Backport for Drop underscore from titles in wgOATH2FARequiredGroupRemovalPages (duration: 10m 46s)
  • 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host ncredir4003.ulsfo.wmnet
  • 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ncredir4003.ulsfo.wmnet with OS bookworm
  • 08:17 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve1014.eqiad.wmnet with reason: host reimage
  • 08:15 mszwarc@deploy2002: mszwarc: Continuing with sync
  • 08:14 mszwarc@deploy2002: mszwarc: Backport for Drop underscore from titles in wgOATH2FARequiredGroupRemovalPages synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 08:10 mszwarc@deploy2002: Started scap sync-world: Backport for Drop underscore from titles in wgOATH2FARequiredGroupRemovalPages
  • 08:09 mszwarc@deploy2002: Finished scap sync-world: Backport for Display list of 2FA-req. groups on AccountSecurity for 2FA-less users (T419422), Send2FAWarningNotifications: Support reading users from file (T419111) (duration: 33m 07s)
  • 08:05 moritzm: installing mariadb bugfix updates from Bookworm point release (tools and libraries as packaged in Debian, unrelated to the wmf-mariadb packages)
  • 08:04 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host ml-serve1014.eqiad.wmnet with OS trixie
  • 08:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ncredir4003.ulsfo.wmnet with reason: host reimage
  • 07:58 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ncredir4003.ulsfo.wmnet with reason: host reimage
  • 07:57 mszwarc@deploy2002: mszwarc: Continuing with sync
  • 07:56 mszwarc@deploy2002: mszwarc: Backport for Display list of 2FA-req. groups on AccountSecurity for 2FA-less users (T419422), Send2FAWarningNotifications: Support reading users from file (T419111) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 07:44 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti1049.eqiad.wmnet
  • 07:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1049.eqiad.wmnet
  • 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1033.eqiad.wmnet
  • 07:42 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1033.eqiad.wmnet
  • 07:38 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ncredir4003.ulsfo.wmnet with OS bookworm
  • 07:36 mszwarc@deploy2002: Started scap sync-world: Backport for Display list of 2FA-req. groups on AccountSecurity for 2FA-less users (T419422), Send2FAWarningNotifications: Support reading users from file (T419111)
  • 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM ncredir4003.ulsfo.wmnet - jmm@cumin2002"
  • 07:34 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM ncredir4003.ulsfo.wmnet - jmm@cumin2002"
  • 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ncredir4003.ulsfo.wmnet on all recursors
  • 07:34 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache ncredir4003.ulsfo.wmnet on all recursors
  • 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM ncredir4003.ulsfo.wmnet - jmm@cumin2002"
  • 07:34 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM ncredir4003.ulsfo.wmnet - jmm@cumin2002"
  • 07:27 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 07:27 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host ncredir4003.ulsfo.wmnet
  • 07:22 kgraessle@deploy2002: Finished scap sync-world: Backport for Enable rr-ml AutoModerator CC Set AutoModeratorMultiLingualRevertRisk with available wikis (T400727) (duration: 12m 24s)
  • 07:18 kgraessle@deploy2002: kgraessle: Continuing with sync
  • 07:12 kgraessle@deploy2002: kgraessle: Backport for Enable rr-ml AutoModerator CC Set AutoModeratorMultiLingualRevertRisk with available wikis (T400727) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 07:09 kgraessle@deploy2002: Started scap sync-world: Backport for Enable rr-ml AutoModerator CC Set AutoModeratorMultiLingualRevertRisk with available wikis (T400727)
  • 02:08 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 07m 59s)
  • 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
  • 00:33 zabe@deploy2002: Finished scap sync-world: Backport for Stop setting $wgImageLinksSchemaMigrationStage (T299953) (duration: 09m 38s)
  • 00:29 zabe@deploy2002: zabe: Continuing with sync
  • 00:26 zabe@deploy2002: zabe: Backport for Stop setting $wgImageLinksSchemaMigrationStage (T299953) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 00:24 zabe@deploy2002: Started scap sync-world: Backport for Stop setting $wgImageLinksSchemaMigrationStage (T299953)
  • 00:03 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2095.codfw.wmnet with reason: host reimage
  • 00:03 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host contint1003.wikimedia.org with OS trixie
  • 00:03 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
  • 00:03 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
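
The START / END (…) cookbook entries above can be paired to see how long each run took. The following is a minimal sketch, not part of any Wikimedia tooling: the regular expression is an assumption inferred from the line shapes in this log, and `pair_runs` and `run_key` are hypothetical helper names. It reads lines newest-first, as the log is ordered, ignores the date headings (so runs that cross midnight are not handled), and skips non-cookbook entries.

```python
import re
from datetime import datetime

# Assumed entry shape, inferred from the log lines above:
#   "HH:MM user@host: START - Cookbook <description>"
#   "HH:MM user@host: END (STATUS) - Cookbook <description with (exit_code=N)>"
ENTRY = re.compile(
    r"(?P<time>\d{2}:\d{2}) (?P<who>\S+): "
    r"(?P<kind>START|END \((?P<status>\w+)\)) - Cookbook (?P<rest>.+)"
)


def run_key(who, rest):
    """Build a key shared by the START and END lines of one run.

    END lines carry an extra " (exit_code=N)" that START lines lack,
    so it is removed before comparing.
    """
    return who, re.sub(r" \(exit_code=\d+\)", "", rest).strip()


def pair_runs(lines):
    """Yield (who, description, status, minutes) for each START/END pair.

    Lines are expected newest-first, so an END is seen before its START.
    """
    pending = {}  # key -> stack of (end_time, status) still waiting for a START
    for line in lines:
        m = ENTRY.search(line)
        if not m:
            continue  # not a cookbook START/END entry
        t = datetime.strptime(m["time"], "%H:%M")
        key = run_key(m["who"], m["rest"])
        if m["kind"] == "START":
            if pending.get(key):
                # The most recently seen END is the closest one after this START.
                end_time, status = pending[key].pop()
                minutes = int((end_time - t).total_seconds() // 60)
                yield key[0], key[1], status, minutes
        else:
            pending.setdefault(key, []).append((t, m["status"]))


if __name__ == "__main__":
    sample = [
        "12:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ncredir4003.ulsfo.wmnet with OS bookworm",
        "12:04 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ncredir4003.ulsfo.wmnet with OS bookworm",
    ]
    for who, what, status, minutes in pair_runs(sample):
        print(f"{who}: {what} -> {status} after {minutes} min")
```

Keying on both the user and the description keeps concurrent runs of the same cookbook on different hosts apart; a START with no END in the input is silently dropped.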

2026-03-10

  • 23:58 jhancock@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2095.codfw.wmnet with reason: host reimage
  • 23:53 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2096.codfw.wmnet with reason: host reimage
  • 23:49 jhancock@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2096.codfw.wmnet with reason: host reimage
  • 23:44 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on contint1003.wikimedia.org with reason: host reimage
  • 23:40 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on contint1003.wikimedia.org with reason: host reimage
  • 23:31 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host ms-be2096.codfw.wmnet with OS bullseye
  • 23:31 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host ms-be2095.codfw.wmnet with OS bullseye
  • 23:26 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2095.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 23:24 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2096.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 23:22 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host contint1003.wikimedia.org with OS trixie
  • 23:22 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host contint1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 23:12 vriley@cumin1003: START - Cookbook sre.hosts.provision for host contint1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 23:11 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host ms-be2096.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 23:05 cjming@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
  • 23:05 cjming@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
  • 22:59 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host ms-be2095.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 22:39 cjming@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
  • 22:38 cjming@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
  • 21:51 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp7012.magru.wmnet with OS trixie
  • 21:48 Dreamy_Jazz: Evening UTC backport window done
  • 21:42 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp7006.magru.wmnet [reason: trixie reimaging]
  • 21:42 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7006.magru.wmnet with OS trixie
  • 21:25 tgr@deploy2002: Finished scap sync-world: Backport for Migrate EmailAuth, step 2 (T404334) (duration: 25m 34s)
  • 21:24 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp7007.magru.wmnet [reason: trixie reimaging]
  • 21:22 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7007.magru.wmnet with OS trixie
  • 21:21 tgr@deploy2002: tgr: Continuing with sync
  • 21:13 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7006.magru.wmnet with reason: host reimage
  • 21:09 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7006.magru.wmnet with reason: host reimage
  • 21:02 tgr@deploy2002: tgr: Backport for Migrate EmailAuth, step 2 (T404334) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 21:00 tgr@deploy2002: Started scap sync-world: Backport for Migrate EmailAuth, step 2 (T404334)
  • 20:59 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7012.magru.wmnet with OS trixie
  • 20:54 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7007.magru.wmnet with reason: host reimage
  • 20:50 jforrester@deploy2002: Finished scap sync-world: Backport for Deploy participant recruitment survey on ptwiki and trwiki (T419275), wikifunctions: Drop temporary WikifunctionsEnableHTMLOutput flag (T397402), wikifunctions: Drop temporary WikifunctionsEnableWikidataInputTypes flag (T397403), [[gerrit:1249393|build: Upgrade mediawiki-phan-config from 0.18.0 to 0.2
  • 20:48 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7007.magru.wmnet with reason: host reimage
  • 20:46 jforrester@deploy2002: dani, jforrester: Continuing with sync
  • 20:45 jforrester@deploy2002: dani, jforrester: Backport for Deploy participant recruitment survey on ptwiki and trwiki (T419275), wikifunctions: Drop temporary WikifunctionsEnableHTMLOutput flag (T397402), wikifunctions: Drop temporary WikifunctionsEnableWikidataInputTypes flag (T397403), [[gerrit:1249393|build: Upgrade mediawiki-phan-config from 0.18.0 to 0.20.0 (T41
  • 20:43 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp7006.magru.wmnet with OS trixie
  • 20:43 jforrester@deploy2002: Started scap sync-world: Backport for Deploy participant recruitment survey on ptwiki and trwiki (T419275), wikifunctions: Drop temporary WikifunctionsEnableHTMLOutput flag (T397402), wikifunctions: Drop temporary WikifunctionsEnableWikidataInputTypes flag (T397403), [[gerrit:1249393|build: Upgrade mediawiki-phan-config from 0.18.0 to 0.20
  • 20:42 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7006.magru.wmnet with OS trixie
  • 20:42 cdobbins@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - cdobbins@cumin2002"
  • 20:38 jforrester@deploy2002: Finished scap sync-world: Backport for Enable personal main menu to all users in Minerva Neue skin (T413912), Enables legacy processing in ParserOutputPostCacheTransform when cached (T372592), Parser: Raise minimum TTL from 30 min to 'next midnight' in miser mode (T416616 T416540 T419439) (duration: 12m 58s)
  • 20:36 cdobbins@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - cdobbins@cumin2002"
  • 20:34 jforrester@deploy2002: jforrester, cscott, bwang: Continuing with sync
  • 20:27 jforrester@deploy2002: jforrester, cscott, bwang: Backport for Enable personal main menu to all users in Minerva Neue skin (T413912), Enables legacy processing in ParserOutputPostCacheTransform when cached (T372592), Parser: Raise minimum TTL from 30 min to 'next midnight' in miser mode (T416616 T416540 T419439) synced to the testservers (see https://wikitech.wi
  • 20:25 jforrester@deploy2002: Started scap sync-world: Backport for Enable personal main menu to all users in Minerva Neue skin (T413912), Enables legacy processing in ParserOutputPostCacheTransform when cached (T372592), Parser: Raise minimum TTL from 30 min to 'next midnight' in miser mode (T416616 T416540 T419439)
  • 20:25 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp7007.magru.wmnet with OS trixie
  • 20:24 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp7007.magru.wmnet [reason: trixie reimaging]
  • 20:24 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp7005.magru.wmnet [reason: trixie reimaging]
  • 20:16 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7005.magru.wmnet with OS trixie
  • 20:10 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7006.magru.wmnet with reason: host reimage
  • 20:03 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7013.*
  • 20:03 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7006.magru.wmnet with reason: host reimage
  • 19:50 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7013.magru.wmnet with OS trixie
  • 19:49 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7005.magru.wmnet with reason: host reimage
  • 19:42 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7005.magru.wmnet with reason: host reimage
  • 19:40 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp7006.magru.wmnet with OS trixie
  • 19:40 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp7006.magru.wmnet [reason: trixie reimaging]
  • 19:39 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp7004.magru.wmnet [reason: trixie reimaging]
  • 19:24 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7013.magru.wmnet with reason: host reimage
  • 19:19 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp7005.magru.wmnet with OS trixie
  • 19:19 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7004.magru.wmnet with OS trixie
  • 19:19 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp7005.magru.wmnet [reason: trixie reimaging]
  • 19:18 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7013.magru.wmnet with reason: host reimage
  • 19:17 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp7003.magru.wmnet [reason: trixie reimaging]
  • 19:16 brennen@deploy2002: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.19 refs T413810
  • 19:16 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7003.magru.wmnet with OS trixie
  • 19:09 brennen: 1.46.0-wmf.19 train status: blockers believed resolved, rolling to group0
  • 19:07 brennen@deploy2002: Finished scap sync-world: Backport for Re-add correct namespace for translatable pages (T419294) (duration: 12m 30s)
  • 19:06 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host contint1003.wikimedia.org with OS trixie
  • 19:01 brennen@deploy2002: abi, brennen: Continuing with sync
  • 18:58 brennen@deploy2002: abi, brennen: Backport for Re-add correct namespace for translatable pages (T419294) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 18:55 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7013.magru.wmnet with OS trixie
  • 18:54 brennen@deploy2002: Started scap sync-world: Backport for Re-add correct namespace for translatable pages (T419294)
  • 18:52 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7004.magru.wmnet with reason: host reimage
  • 18:52 brennen@deploy2002: Finished scap sync-world: testwikis to 1.46.0-wmf.19 refs T413810 (duration: 38m 34s)
  • 18:49 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7004.magru.wmnet with reason: host reimage
  • 18:47 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7003.magru.wmnet with reason: host reimage
  • 18:44 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7003.magru.wmnet with reason: host reimage
  • 18:41 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7015.*
  • 18:27 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7015.magru.wmnet with OS trixie
  • 18:23 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp7004.magru.wmnet with OS trixie
  • 18:21 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp7004.magru.wmnet [reason: trixie reimaging]
  • 18:16 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp7003.magru.wmnet with OS trixie
  • 18:13 brennen@deploy2002: Started scap sync-world: testwikis to 1.46.0-wmf.19 refs T413810
  • 18:13 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp7003.magru.wmnet [reason: trixie reimaging]
  • 18:13 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp7003.magru.wmnet [reason: trixie reimaging]
  • 18:00 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 17:59 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 17:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7015.magru.wmnet with reason: host reimage
  • 17:59 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
  • 17:59 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
  • 17:56 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7015.magru.wmnet with reason: host reimage
  • 17:54 hashar@deploy2002: Finished deploy [integration/docroot@f544f49]: Catch up with composer/npm dev dependencies. Noop for production (duration: 00m 11s)
  • 17:54 hashar@deploy2002: Started deploy [integration/docroot@f544f49]: Catch up with composer/npm dev dependencies. Noop for production
  • 17:43 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
  • 17:43 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
  • 17:32 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
  • 17:32 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
  • 17:31 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
  • 17:30 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7015.magru.wmnet with OS trixie
  • 17:30 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
  • 17:29 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
  • 17:29 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
  • 17:28 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
  • 17:28 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host contint1003.wikimedia.org with OS trixie
  • 17:28 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
  • 17:26 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host contint1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 17:24 vriley@cumin1003: START - Cookbook sre.hosts.provision for host contint1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 17:23 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host contint1003.wikimedia.org with OS trixie
  • 17:22 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 17:21 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 17:12 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
  • 17:12 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
  • 17:11 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
  • 17:11 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
  • 17:09 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
  • 17:09 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
  • 17:01 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host contint1003.wikimedia.org with OS trixie
  • 16:40 andrew@dns1004: END - running authdns-update
  • 16:38 andrew@dns1004: START - running authdns-update
  • 16:25 reedy@deploy2002: Finished scap sync-world: Backport for Revert "CommonSettings: Remove orphaned $wgWebAuthnNewCredsDisabled" (duration: 07m 45s)
  • 16:21 reedy@deploy2002: reedy: Continuing with sync
  • 16:19 reedy@deploy2002: reedy: Backport for Revert "CommonSettings: Remove orphaned $wgWebAuthnNewCredsDisabled" synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 16:17 reedy@deploy2002: Started scap sync-world: Backport for Revert "CommonSettings: Remove orphaned $wgWebAuthnNewCredsDisabled"
  • 15:59 jynus@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
  • 15:59 taavi: update cr firewall policy for codfw1dev ldap tree https://gerrit.wikimedia.org/r/1249985
  • 15:55 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-fr-tech: apply
  • 15:55 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-fr-tech: apply
  • 15:55 jynus@deploy2002: helmfile [codfw] START helmfile.d/services/mw-cron: apply
  • 15:48 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 15:46 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 15:39 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 15:34 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 15:28 brouberol@dns1004: END - running authdns-update
  • 15:27 brouberol@dns1004: START - running authdns-update
  • 15:10 swfrench@cumin2002: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Deploy: Fix an edge case in validation of a new object - swfrench@cumin2002"
  • 15:10 swfrench@cumin2002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Deploy: Fix an edge case in validation of a new object - swfrench@cumin2002
  • 15:09 swfrench@cumin2002: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Deploy: Fix an edge case in validation of a new object - swfrench@cumin2002
  • 15:09 swfrench@cumin2002: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Deploy: Fix an edge case in validation of a new object - swfrench@cumin2002"
  • 15:05 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 14:59 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 14:58 sukhe: sudo cumin -b1 -s15 "C:bird" "run-puppet-agent --enable 'merging CR 1238007; add function return type'"
  • 14:58 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 14:58 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 14:58 sukhe: sudo cumin -b1 -s15 "C:bird" "run-puppet-agent 'merging CR 1238007; add function return type'"
  • 14:51 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host thanos-be1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 14:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 14:44 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 14:42 sukhe: sudo cumin "C:bird" "disable-puppet 'merging CR 1238007; add function return type'"
  • 14:41 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.powercycle (exit_code=0) for host ml-serve1014
  • 14:39 elukey@cumin1003: START - Cookbook sre.hosts.provision for host thanos-be1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 14:36 elukey@cumin1003: START - Cookbook sre.hosts.powercycle for host ml-serve1014
  • 14:36 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.powercycle (exit_code=99) for host ml-serve1014
  • 14:36 elukey@cumin1003: START - Cookbook sre.hosts.powercycle for host ml-serve1014
  • 14:12 otto@deploy2002: Finished scap sync-world: Backport for stream: mediawiki.page_edit_type_simple.dev0 (T351225) (duration: 11m 05s)
  • 14:08 otto@deploy2002: akhatun, otto: Continuing with sync
  • 14:02 otto@deploy2002: akhatun, otto: Backport for stream: mediawiki.page_edit_type_simple.dev0 (T351225) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:01 otto@deploy2002: Started scap sync-world: Backport for stream: mediawiki.page_edit_type_simple.dev0 (T351225)
  • 13:49 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host contint1003.wikimedia.org with OS trixie
  • 13:43 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
  • 13:42 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
  • 13:28 vgutierrez: testing acme-chief 0.39 in acmechief-test2001 - T419352
  • 13:27 vgutierrez: upload acme-chief 0.39 to bookworm-wikimedia (apt.wm.o) - T419352
  • 13:16 jiji@cumin1003: END (FAIL) - Cookbook sre.memcached.roll-reboot-restart (exit_code=1) rolling restart_daemons on A:memcached-canary
  • 13:16 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling restart_daemons on A:memcached-canary
  • 13:12 mszwarc@deploy2002: Finished scap sync-world: Backport for Require 2FA from 6 other user groups ($wgRestrictedGroups) (T418580), kaiwiki: add logo, stiename, projectnamespace and timezone (T414237) (duration: 08m 45s)
  • 13:08 mszwarc@deploy2002: mszwarc, anzx: Continuing with sync
  • 13:05 mszwarc@deploy2002: mszwarc, anzx: Backport for Require 2FA from 6 other user groups ($wgRestrictedGroups) (T418580), kaiwiki: add logo, stiename, projectnamespace and timezone (T414237) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 13:03 mszwarc@deploy2002: Started scap sync-world: Backport for Require 2FA from 6 other user groups ($wgRestrictedGroups) (T418580), kaiwiki: add logo, stiename, projectnamespace and timezone (T414237)
  • 13:03 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-gutter-eqiad
  • 12:57 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host ml-serve1015.eqiad.wmnet with OS bookworm
  • 12:56 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-gutter-eqiad
  • 12:51 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host ml-serve1014.eqiad.wmnet with OS bookworm
  • 12:50 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ml-serve1014
  • 12:50 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ml-serve1014
  • 12:49 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 12:49 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 12:49 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 12:49 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 12:48 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 12:48 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 12:48 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 12:48 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 12:47 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 12:45 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 12:44 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 12:42 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 12:42 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling restart_daemons on A:memcached-canary
  • 12:42 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling restart_daemons on A:memcached-canary
  • 12:31 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 12:31 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 12:24 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be1095.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 12:10 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ms-be1095.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 12:10 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be1095.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 11:59 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ms-be1095.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 11:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-fe2024.codfw.wmnet with OS bullseye
  • 11:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - ayounsi@cumin1003"
  • 11:17 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - ayounsi@cumin1003"
  • 11:15 Emperor: rebalance codfw swift rings T354872
  • 10:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-fe2024.codfw.wmnet with reason: host reimage
  • 10:47 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-fe2024.codfw.wmnet with reason: host reimage
  • 10:31 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host ms-fe2024.codfw.wmnet with OS bullseye
  • 10:30 ayounsi@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-fe2024.codfw.wmnet with OS bullseye
  • 10:20 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host ms-fe2024.codfw.wmnet with OS bullseye
  • 10:17 gkyziridis@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
  • 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device cr2-eqdfw
  • 09:31 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device cr2-eqdfw
  • 09:22 derick@deploy2002: mwscript-k8s job started: extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=loginwiki --logwiki=metawiki TMPRI1975 FondueFanatic # T419499
  • 09:00 arnaudb@dns1005: END - running authdns-update
  • 09:00 godog: restore all host interfaces - T417393
  • 08:58 arnaudb@dns1005: START - running authdns-update
  • 08:30 godog: disabled interface for cloudcephmon1004 - T417393
  • 08:22 godog: disabled interfaces for cloudcephosd1021 cloudcephosd1042 cloudcephosd1043 cloudcephosd1018 cloudcephosd1022 - T417393
  • 08:18 godog: disabled interfaces for cloudcephosd1016 cloudcephosd1017 cloudcephosd1016 cloudcephosd1018 cloudcephosd1017 cloudcephosd1035 - T417393
  • 08:05 godog: start disabling cloudcephosd interfaces - T417393
  • 07:49 godog: prep cloudsw reboot tests 'ceph osd set noout' - T417393
  • 07:41 filippo@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 19 hosts with reason: switch down tests
  • 06:14 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wdqs2009.codfw.wmnet with OS bookworm
  • 04:09 pt1979@cumin2002: END (FAIL) - Cookbook sre.network.tls (exit_code=99) for network device asw1-23-ulsfo
  • 04:08 pt1979@cumin2002: START - Cookbook sre.network.tls for network device asw1-23-ulsfo
  • 04:01 mwpresync@deploy2002: Pruned MediaWiki: 1.46.0-wmf.16 (duration: 01m 48s)
  • 02:09 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 10s)
  • 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
  • 01:37 ryankemper: [WDQS] T410573 repooled wdqs1011.eqiad.wmnet - erroneously depooled since `2025-11-19` by failed `sre.wdqs.reboot` cookbook
  • 00:42 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host contint1003.wikimedia.org with OS trixie
  • 00:39 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host contint1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 00:29 vriley@cumin1003: START - Cookbook sre.hosts.provision for host contint1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED

2026-03-09

  • 22:51 rzl: root@apt1002:~# reprepro --noskipold --restrict vopsbot update bookworm-wikimedia
  • 22:34 bking@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM dse-k8s-ctrl1001.eqiad.wmnet
  • 22:32 bking@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM dse-k8s-ctrl1001.eqiad.wmnet
  • 22:30 bking@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM dse-k8s-ctrl1002.eqiad.wmnet
  • 22:29 bking@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM dse-k8s-ctrl1002.eqiad.wmnet
  • 22:28 bking@cumin2002: END (FAIL) - Cookbook sre.ganeti.reboot-vm (exit_code=99) for VM dse-k8s-ctrl1002.eqiad.wmnet
  • 22:28 bking@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM dse-k8s-ctrl1002.eqiad.wmnet
  • 22:28 bking@cumin2002: END (FAIL) - Cookbook sre.ganeti.reboot-vm (exit_code=99) for VM dse-k8s-ctrl1002.eqiad.wmnet
  • 22:28 bking@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM dse-k8s-ctrl1002.eqiad.wmnet
  • 22:03 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudgw2004-dev.codfw.wmnet with OS trixie
  • 22:02 alexsanford: Redeployed security fix for T419186
  • 21:44 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudgw2004-dev.codfw.wmnet with reason: host reimage
  • 21:40 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudgw2004-dev.codfw.wmnet with reason: host reimage
  • 21:37 cdobbins@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7002.magru.wmnet
  • 21:34 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7002.magru.wmnet with OS trixie
  • 21:29 alexsanford: Deployed security fix for T419186
  • 21:22 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudgw2004-dev.codfw.wmnet with OS trixie
  • 21:21 andrew@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudgw2004-dev.codfw.wmnet with OS trixie
  • 21:17 dani@deploy2002: Finished scap sync-world: Backport for Pre-deploy participant recruitment survey on ptwiki and trwiki (T419275) (duration: 08m 15s)
  • 21:13 dani@deploy2002: dani: Continuing with sync
  • 21:11 dani@deploy2002: dani: Backport for Pre-deploy participant recruitment survey on ptwiki and trwiki (T419275) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 21:09 dani@deploy2002: Started scap sync-world: Backport for Pre-deploy participant recruitment survey on ptwiki and trwiki (T419275)
  • 21:08 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudgw2004-dev.codfw.wmnet with reason: host reimage
  • 21:05 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on cp7002.magru.wmnet with reason: host reimage
  • 21:02 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7002.magru.wmnet with reason: host reimage
  • 21:01 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudgw2004-dev.codfw.wmnet with reason: host reimage
  • 21:01 tgr_: removed private code for T397244
  • 21:01 ryankemper: [WDQS] Alright, these are re-entering a failed state soon enough that we will need to identify the offender if we want to restore proper service. We could put some temporary hack to restart every few minutes so we at least maintain some uptime, but root cause is the usual 'we need a requestctl rule to block whoever's killing us' scenario
  • 21:00 cdobbins@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7001.magru.wmnet [reason: Trixie reimaging]
  • 20:57 ryankemper: [WDQS] Auto-remediation would have eventually restarted these, but some of them were staying below our current threshold of `threads > 1200`. May want to lower threshold, or examine an additional metric-type to look at in the future
  • 20:56 ryankemper: [WDQS] `ryankemper@cumin2002:~$ sudo -E cumin 'A:wdqs-main AND P{wdqs1*}' 'systemctl restart wdqs-blazegraph'`
  • 20:54 ryankemper: [WDQS] `ryankemper@cumin2002:~$ sudo -E cumin 'A:wdqs-main AND P{wdqs2*}' 'systemctl restart wdqs-blazegraph'`
  • 20:44 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudgw2004-dev.codfw.wmnet with OS trixie
  • 20:43 tgr@deploy2002: Unlocked for deployment [MediaWiki]: working on private change (duration: 10m 10s)
  • 20:36 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp7002.magru.wmnet with OS trixie
  • 20:33 tgr@deploy2002: Locking from deployment [MediaWiki]: working on private change
  • 20:31 tgr@deploy2002: Finished scap sync-world: Backport for Enable parser survey for opted-out users on German/French/Polish wikis (T414852), lift IP cap for womens month editathon (T419109) (duration: 13m 36s)
  • 20:27 tgr@deploy2002: cscott, tgr, anzx: Continuing with sync
  • 20:19 tgr@deploy2002: cscott, tgr, anzx: Backport for Enable parser survey for opted-out users on German/French/Polish wikis (T414852), lift IP cap for womens month editathon (T419109) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 20:17 tgr@deploy2002: Started scap sync-world: Backport for Enable parser survey for opted-out users on German/French/Polish wikis (T414852), lift IP cap for womens month editathon (T419109)
  • 20:13 aaron@deploy2002: Finished scap sync-world: Backport for Remove redundant math spec file from wwwportal (T418188) (duration: 06m 56s)
  • 20:09 aaron@deploy2002: aaron: Continuing with sync
  • 20:08 aaron@deploy2002: aaron: Backport for Remove redundant math spec file from wwwportal (T418188) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 20:06 aaron@deploy2002: Started scap sync-world: Backport for Remove redundant math spec file from wwwportal (T418188)
  • 20:01 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7016.*
  • 19:54 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7001.magru.wmnet with OS trixie
  • 19:51 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7016.magru.wmnet with OS trixie
  • 19:49 zabe@deploy2002: Finished scap sync-world: Backport for Stop writing to il_to on commonswiki (T415787) (duration: 06m 04s)
  • 19:45 zabe@deploy2002: zabe: Continuing with sync
  • 19:44 zabe@deploy2002: zabe: Backport for Stop writing to il_to on commonswiki (T415787) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 19:43 zabe@deploy2002: Started scap sync-world: Backport for Stop writing to il_to on commonswiki (T415787)
  • 19:29 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
  • 19:28 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
  • 19:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7001.magru.wmnet with reason: host reimage
  • 19:24 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7016.magru.wmnet with reason: host reimage
  • 19:23 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7001.magru.wmnet with reason: host reimage
  • 19:19 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7016.magru.wmnet with reason: host reimage
  • 19:15 cwhite@deploy2002: Finished deploy [performance/arc-lamp@aa8da8b]: Ie7e035 (duration: 00m 08s)
  • 19:15 cwhite@deploy2002: Started deploy [performance/arc-lamp@aa8da8b]: Ie7e035
  • 19:14 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
  • 19:14 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
  • 19:05 herron@deploy2002: Finished scap sync-world: Backport for udp2log: switch to new hosts (duration: 09m 38s)
  • 19:03 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
  • 19:03 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
  • 19:01 herron@deploy2002: herron: Continuing with sync
  • 19:00 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
  • 19:00 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
  • 18:59 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-cron: apply
  • 18:59 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
  • 18:59 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
  • 18:59 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
  • 18:57 herron@deploy2002: herron: Backport for udp2log: switch to new hosts synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 18:57 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp7001.magru.wmnet with OS trixie
  • 18:55 herron@deploy2002: Started scap sync-world: Backport for udp2log: switch to new hosts
  • 18:55 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7016.magru.wmnet with OS trixie
  • 18:50 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
  • 18:49 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
  • 18:44 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
  • 18:44 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
  • 18:34 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
  • 18:34 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
  • 18:33 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
  • 18:33 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
  • 18:32 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 18:32 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 18:29 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
  • 18:29 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
  • 18:27 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
  • 18:27 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
  • 18:23 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 18:23 dzahn@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host contint1003.wikimedia.org with OS trixie
  • 18:23 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 18:16 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
  • 18:16 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
  • 18:06 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-ctrl1001.eqiad.wmnet
  • 18:05 herron@deploy2002: Sync cancelled.
  • 18:04 herron@deploy2002: herron: Backport for Revert "udp2log: switch to new hosts" synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 18:02 herron@deploy2002: Started scap sync-world: Backport for Revert "udp2log: switch to new hosts"
  • 18:01 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host dse-k8s-ctrl1001.eqiad.wmnet
  • 17:54 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
  • 17:47 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
  • 17:42 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
  • 17:42 herron@deploy2002: Sync cancelled.
  • 17:40 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
  • 17:39 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
  • 17:38 mutante: contint1003 - unable to get uptime Caused by: Cumin execution failed (exit_code=2) [101/240] - attempted manual powercycle - Initializing Firmware Interfaces... blank screen T418544
  • 17:34 mutante: contint1003.mgmt - racadm serveraction powercycle T418544 - not reacting
  • 17:25 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
  • 17:25 herron@deploy2002: herron: Backport for udp2log: switch to new hosts (T417002) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 17:23 herron@deploy2002: Started scap sync-world: Backport for udp2log: switch to new hosts (T417002)
  • 17:19 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
  • 17:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host netflow4003.ulsfo.wmnet
  • 17:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host netflow4003.ulsfo.wmnet with OS bookworm
  • 17:13 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
  • 17:08 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
  • 17:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2001.codfw.wmnet
  • 17:03 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host contint1003.wikimedia.org with OS trixie
  • 17:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2001.codfw.wmnet
  • 17:00 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.sanitize-wiki (exit_code=0) Managing sanitization for wikis kaiwiki in section s5
  • 16:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on netflow4003.ulsfo.wmnet with reason: host reimage
  • 16:53 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on netflow4003.ulsfo.wmnet with reason: host reimage
  • 16:37 moritzm: installing gnupg security updates
  • 16:31 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host netflow4003.ulsfo.wmnet with OS bookworm
  • 16:31 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM netflow4003.ulsfo.wmnet - jmm@cumin2002"
  • 16:30 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM netflow4003.ulsfo.wmnet - jmm@cumin2002"
  • 16:30 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netflow4003.ulsfo.wmnet on all recursors
  • 16:30 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netflow4003.ulsfo.wmnet on all recursors
  • 16:30 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:30 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM netflow4003.ulsfo.wmnet - jmm@cumin2002"
  • 16:30 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM netflow4003.ulsfo.wmnet - jmm@cumin2002"
  • 16:26 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 16:26 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host netflow4003.ulsfo.wmnet
  • 16:26 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
  • 16:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host prometheus4003.ulsfo.wmnet with OS bookworm
  • 15:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on prometheus4003.ulsfo.wmnet with reason: host reimage
  • 15:53 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on prometheus4003.ulsfo.wmnet with reason: host reimage
  • 15:44 vgutierrez: vgutierrez@acmechief-test2001:~$ sudo -i systemctl disable reload-acme-chief-backend.timer - T419352
  • 15:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test1005.wikimedia.org
  • 15:37 gkyziridis@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
  • 15:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host idp-test1005.wikimedia.org
  • 15:30 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host prometheus4003.ulsfo.wmnet with OS bookworm
  • 15:26 elukey@cumin1003: END (PASS) - Cookbook sre.kafka.change-confluent-distro-version (exit_code=0) Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
  • 15:24 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
  • 15:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host people2004.codfw.wmnet
  • 15:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host people2004.codfw.wmnet
  • 15:12 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs2009.codfw.wmnet with reason: host reimage
  • 15:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-jumbo1001.eqiad.wmnet
  • 15:08 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs2009.codfw.wmnet with reason: host reimage
  • 15:03 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-jumbo1001.eqiad.wmnet
  • 14:50 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs2009.codfw.wmnet with OS bookworm
  • 14:49 bking@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wdqs2009.codfw.wmnet with OS bullseye
  • 14:45 elukey@cumin1003: START - Cookbook sre.kafka.change-confluent-distro-version Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
  • 14:35 mszwarc@deploy2002: Finished scap sync-world: Backport for Hide 2fa-warning Echo category from preferences (T419111) (duration: 06m 07s)
  • 14:35 fceratto@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Managing sanitization for wikis kaiwiki in section s5
  • 14:34 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.sanitize-wiki (exit_code=99) Managing sanitization for wikis urwikisource in section s5
  • 14:31 mszwarc@deploy2002: mszwarc: Continuing with sync
  • 14:31 mszwarc@deploy2002: mszwarc: Backport for Hide 2fa-warning Echo category from preferences (T419111) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:30 fceratto@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Managing sanitization for wikis urwikisource in section s5
  • 14:29 mszwarc@deploy2002: Started scap sync-world: Backport for Hide 2fa-warning Echo category from preferences (T419111)
  • 14:25 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.sanitize-wiki (exit_code=0) Checking sanitization for wikis urwikisource in section s5
  • 14:22 fceratto@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Checking sanitization for wikis urwikisource in section s5
  • 14:20 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs2009.codfw.wmnet with reason: host reimage
  • 14:15 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs2009.codfw.wmnet with reason: host reimage
  • 14:15 phuedx@deploy2002: Finished scap sync-world: Backport for JS SDK: Add getExperimentByPrefix() (T419191), ext.wikimediaEvents: pageVisit -> loggedOutReaderRetention (T419191) (duration: 09m 39s)
  • 14:11 phuedx@deploy2002: phuedx: Continuing with sync
  • 14:07 phuedx@deploy2002: phuedx: Backport for JS SDK: Add getExperimentByPrefix() (T419191), ext.wikimediaEvents: pageVisit -> loggedOutReaderRetention (T419191) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:05 phuedx@deploy2002: Started scap sync-world: Backport for JS SDK: Add getExperimentByPrefix() (T419191), ext.wikimediaEvents: pageVisit -> loggedOutReaderRetention (T419191)
  • 14:03 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host prometheus4003.ulsfo.wmnet with OS bookworm
  • 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cumin2003.codfw.wmnet
  • 13:54 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs2009.codfw.wmnet with OS bullseye
  • 13:50 phuedx@deploy2002: Finished scap sync-world: Backport for Disable MetricsPlatform extension (T416865) (duration: 08m 02s)
  • 13:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host cumin2003.codfw.wmnet
  • 13:46 phuedx@deploy2002: phuedx, sfaci: Continuing with sync
  • 13:44 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
  • 13:44 phuedx@deploy2002: phuedx, sfaci: Backport for Disable MetricsPlatform extension (T416865) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 13:42 phuedx@deploy2002: Started scap sync-world: Backport for Disable MetricsPlatform extension (T416865)
  • 13:39 phuedx@deploy2002: Finished scap sync-world: Backport for Confirmemail: Log delay between email sent and confirmation (T415902), Enable confirmemail logstash channel (T415902) (duration: 11m 16s)
  • 13:35 phuedx@deploy2002: mmartorana, phuedx: Continuing with sync
  • 13:30 phuedx@deploy2002: mmartorana, phuedx: Backport for Confirmemail: Log delay between email sent and confirmation (T415902), Enable confirmemail logstash channel (T415902) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 13:28 phuedx@deploy2002: Started scap sync-world: Backport for Confirmemail: Log delay between email sent and confirmation (T415902), Enable confirmemail logstash channel (T415902)
  • 13:10 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host prometheus4003.ulsfo.wmnet with OS bookworm
  • 13:04 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host prometheus4003.ulsfo.wmnet with OS bookworm
  • 12:55 moritzm: installing Kerberos security updates
  • 12:29 moritzm: installing python3.9 security updates
  • 12:11 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host prometheus4003.ulsfo.wmnet with OS bookworm
  • 12:00 reedy@deploy2002: Finished scap sync-world: Backport for Revert "CommonSettings: Temporarily set $wgOATHUserHandlesTable = true" (T416544), CommonSettings: Remove orphaned $wgWebAuthnNewCredsDisabled (duration: 06m 13s)
  • 11:56 reedy@deploy2002: reedy: Continuing with sync
  • 11:56 reedy@deploy2002: reedy: Backport for Revert "CommonSettings: Temporarily set $wgOATHUserHandlesTable = true" (T416544), CommonSettings: Remove orphaned $wgWebAuthnNewCredsDisabled synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 11:54 reedy@deploy2002: Started scap sync-world: Backport for Revert "CommonSettings: Temporarily set $wgOATHUserHandlesTable = true" (T416544), CommonSettings: Remove orphaned $wgWebAuthnNewCredsDisabled
  • 11:44 phuedx@deploy2002: Finished scap sync-world: Backport for Hooks: Really only add global logging context for pageviews (duration: 12m 02s)
  • 11:38 phuedx@deploy2002: phuedx: Continuing with sync
  • 11:34 phuedx@deploy2002: phuedx: Backport for Hooks: Really only add global logging context for pageviews synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 11:32 phuedx@deploy2002: Started scap sync-world: Backport for Hooks: Really only add global logging context for pageviews
  • 11:29 btullis@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
  • 11:29 btullis@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
  • 11:03 brouberol@deploy2002: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
  • 11:03 brouberol@deploy2002: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
  • 10:57 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host prometheus4003.ulsfo.wmnet with OS bookworm
  • 10:56 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host prometheus4003.ulsfo.wmnet with OS bookworm
  • 10:50 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
  • 10:49 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
  • 10:45 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host prometheus4003.ulsfo.wmnet
  • 10:45 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host prometheus4003.ulsfo.wmnet with OS bookworm
  • 10:45 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host prometheus4003.ulsfo.wmnet with OS bookworm
  • 10:44 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM prometheus4003.ulsfo.wmnet - jmm@cumin2002"
  • 10:44 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM prometheus4003.ulsfo.wmnet - jmm@cumin2002"
  • 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) prometheus4003.ulsfo.wmnet on all recursors
  • 10:43 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache prometheus4003.ulsfo.wmnet on all recursors
  • 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM prometheus4003.ulsfo.wmnet - jmm@cumin2002"
  • 10:43 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM prometheus4003.ulsfo.wmnet - jmm@cumin2002"
  • 10:40 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
  • 10:39 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
  • 10:39 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 10:39 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host prometheus4003.ulsfo.wmnet
  • 10:35 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:33 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 10:17 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 10:12 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 09:51 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
  • 09:46 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 09:43 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 09:40 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host prometheus4003.ulsfo.wmnet
  • 09:40 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 09:35 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 09:35 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host prometheus4003.ulsfo.wmnet
  • 09:31 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host frdb1008
  • 09:31 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host frdb1008
  • 09:29 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
  • 09:05 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
  • 08:37 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wmde: apply
  • 08:36 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wmde: apply
  • 08:35 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
  • 08:35 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
  • 08:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 08:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 08:31 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-sre: apply
  • 08:30 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-sre: apply
  • 08:30 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-search: apply
  • 08:29 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-search: apply
  • 08:29 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-research: apply
  • 08:28 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-research: apply
  • 08:28 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-platform-eng: apply
  • 08:27 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-platform-eng: apply
  • 08:27 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 08:27 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti4006.ulsfo.wmnet to cluster ulsfo02 and group 1
  • 08:27 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 08:25 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4006.ulsfo.wmnet to cluster ulsfo02 and group 1
  • 08:25 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
  • 08:25 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-ml: apply
  • 08:25 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
  • 08:25 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-ml: apply
  • 08:24 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
  • 08:23 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
  • 08:23 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-product: apply
  • 08:22 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-product: apply
  • 08:21 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
  • 08:16 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti4006.ulsfo.wmnet to cluster ulsfo02 and group 1
  • 08:16 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4006.ulsfo.wmnet to cluster ulsfo02 and group 1
  • 08:07 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti4006.ulsfo.wmnet to cluster ulsfo and group 1
  • 08:07 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4006.ulsfo.wmnet to cluster ulsfo and group 1
  • 07:37 mszwarc@deploy2002: Finished scap sync-world: Backport for Add a script to send mandatory 2FA Echo notification (T419111), Set $wgOATH2FARequiredGroupRemovalPages for interface-admins (T417880) (duration: 34m 41s)
  • 07:23 mszwarc@deploy2002: mszwarc: Continuing with sync
  • 07:22 mszwarc@deploy2002: mszwarc: Backport for Add a script to send mandatory 2FA Echo notification (T419111), Set $wgOATH2FARequiredGroupRemovalPages for interface-admins (T417880) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 07:02 mszwarc@deploy2002: Started scap sync-world: Backport for Add a script to send mandatory 2FA Echo notification (T419111), Set $wgOATH2FARequiredGroupRemovalPages for interface-admins (T417880)
  • 02:08 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 07m 58s)
  • 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image

2026-03-08

  • 20:28 vgutierrez@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on acmechief-test2001.codfw.wmnet with reason: GTS issues
  • 02:01 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 00m 59s)
  • 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image

2026-03-07

2026-03-06

  • 23:29 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs2009.codfw.wmnet with OS bullseye
  • 23:13 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs2009.codfw.wmnet with reason: host reimage
  • 23:07 ryankemper@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs2009.codfw.wmnet with reason: host reimage
  • 22:46 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wdqs2009
  • 22:46 ryankemper@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2009
  • 22:46 ryankemper@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2009
  • 22:46 ryankemper@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs2009.codfw.wmnet 141.0.192.10.in-addr.arpa 1.4.1.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 22:46 ryankemper@cumin2002: START - Cookbook sre.dns.wipe-cache wdqs2009.codfw.wmnet 141.0.192.10.in-addr.arpa 1.4.1.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 22:45 ryankemper@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 22:45 ryankemper@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs2009 - ryankemper@cumin2002"
  • 22:45 ryankemper@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs2009 - ryankemper@cumin2002"
  • 22:41 ryankemper@cumin2002: START - Cookbook sre.dns.netbox
  • 22:40 ryankemper@cumin2002: START - Cookbook sre.hosts.move-vlan for host wdqs2009
  • 22:39 ryankemper@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs2009.codfw.wmnet with OS bullseye
  • 19:48 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
  • 19:47 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
  • 19:47 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
  • 19:46 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
  • 19:36 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host wdqs2009.codfw.wmnet
  • 19:23 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2009.codfw.wmnet
  • 19:17 bking@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on wdqs2009.codfw.wmnet with reason: NFS might be hung, about to reboot
  • 18:56 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cp2043.codfw.wmnet with reason: troubleshooting for network drops
  • 18:44 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp2043.*
  • 18:29 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts an-backup-datanode1033.eqiad.wmnet
  • 18:29 btullis@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:29 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-backup-datanode1033.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1003"
  • 18:28 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-backup-datanode1033.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1003"
  • 17:59 ebernhardson@deploy2002: Finished scap sync-world: Backport for cirrus: Use https for semanticsearch-test cluster (duration: 11m 20s)
  • 17:53 ebernhardson@deploy2002: ebernhardson: Continuing with sync
  • 17:52 ebernhardson@deploy2002: ebernhardson: Backport for cirrus: Use https for semanticsearch-test cluster synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 17:51 btullis@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
  • 17:51 btullis@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
  • 17:47 ebernhardson@deploy2002: Started scap sync-world: Backport for cirrus: Use https for semanticsearch-test cluster
  • 17:42 btullis@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
  • 17:42 btullis@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
  • 17:40 btullis@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
  • 17:40 btullis@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
  • 17:11 btullis@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
  • 17:11 btullis@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
  • 17:10 btullis@cumin1003: START - Cookbook sre.dns.netbox
  • 17:05 hashar@deploy2002: Finished deploy [gerrit/gerrit@b8183ba]: wm-checks-api: add tooltip to the CheckRun Run action (duration: 00m 13s)
  • 17:05 hashar@deploy2002: Started deploy [gerrit/gerrit@b8183ba]: wm-checks-api: add tooltip to the CheckRun Run action
  • 17:04 btullis@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-backup-datanode1033.eqiad.wmnet
  • 16:48 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
  • 16:48 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
  • 16:23 btullis@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
  • 16:23 btullis@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
  • 15:57 brouberol@deploy2002: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
  • 15:57 brouberol@deploy2002: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
  • 15:56 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 15:56 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 15:52 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2354-2356].codfw.wmnet
  • 15:52 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2354-2356].codfw.wmnet
  • 15:51 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2356.codfw.wmnet with OS trixie
  • 15:46 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2355.codfw.wmnet with OS trixie
  • 15:42 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2354.codfw.wmnet with OS trixie
  • 15:31 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2356.codfw.wmnet with reason: host reimage
  • 15:31 tchin@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: apply
  • 15:30 tchin@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-main: apply
  • 15:28 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main'.
  • 15:28 tchin@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-main: apply
  • 15:28 tchin@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-main: apply
  • 15:26 tchin@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-main: apply
  • 15:26 tchin@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-main: apply
  • 15:26 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2355.codfw.wmnet with reason: host reimage
  • 15:24 tchin@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics-external: apply
  • 15:23 tchin@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-analytics-external: apply
  • 15:23 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2354.codfw.wmnet with reason: host reimage
  • 15:19 tchin@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply
  • 15:19 tchin@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply
  • 15:17 tchin@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-analytics-external: apply
  • 15:17 tchin@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-analytics-external: apply
  • 15:17 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2356.codfw.wmnet with reason: host reimage
  • 15:16 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2355.codfw.wmnet with reason: host reimage
  • 15:16 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2354.codfw.wmnet with reason: host reimage
  • 15:15 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main'.
  • 15:10 tchin@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics: apply
  • 15:09 tchin@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-analytics: apply
  • 15:08 tchin@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics: apply
  • 15:08 tchin@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-analytics: apply
  • 15:06 tchin@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics: apply
  • 15:05 tchin@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-analytics: apply
  • 15:05 tchin@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics: apply
  • 15:05 tchin@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-analytics: apply
  • 15:03 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2356.codfw.wmnet with OS trixie
  • 15:03 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2355.codfw.wmnet with OS trixie
  • 15:03 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2354.codfw.wmnet with OS trixie
  • 15:02 tchin@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-analytics: apply
  • 15:02 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2348-2353].codfw.wmnet
  • 15:02 tchin@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-analytics: apply
  • 15:02 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2348-2353].codfw.wmnet
  • 14:59 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2353.codfw.wmnet with OS trixie
  • 14:57 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2349.codfw.wmnet with OS trixie
  • 14:57 tchin@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-logging-external: apply
  • 14:56 tchin@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-logging-external: apply
  • 14:53 tchin@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-logging-external: apply
  • 14:52 tchin@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-logging-external: apply
  • 14:52 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2351.codfw.wmnet with OS trixie
  • 14:49 klausman@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
  • 14:49 klausman@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
  • 14:48 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2352.codfw.wmnet with OS trixie
  • 14:48 klausman@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
  • 14:48 klausman@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
  • 14:48 tchin@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply
  • 14:47 tchin@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply
  • 14:45 tchin@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply
  • 14:45 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2350.codfw.wmnet with OS trixie
  • 14:44 tchin@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply
  • 14:43 tchin@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply
  • 14:43 tchin@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply
  • 14:43 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2348.codfw.wmnet with OS trixie
  • 14:41 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2353.codfw.wmnet with reason: host reimage
  • 14:37 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2349.codfw.wmnet with reason: host reimage
  • 14:33 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2351.codfw.wmnet with reason: host reimage
  • 14:30 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2352.codfw.wmnet with reason: host reimage
  • 14:29 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
  • 14:28 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
  • 14:26 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2350.codfw.wmnet with reason: host reimage
  • 14:23 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2348.codfw.wmnet with reason: host reimage
  • 14:21 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2351.codfw.wmnet with reason: host reimage
  • 14:20 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2352.codfw.wmnet with reason: host reimage
  • 14:20 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2353.codfw.wmnet with reason: host reimage
  • 14:20 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2350.codfw.wmnet with reason: host reimage
  • 14:19 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2349.codfw.wmnet with reason: host reimage
  • 14:19 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2348.codfw.wmnet with reason: host reimage
  • 14:07 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2353.codfw.wmnet with OS trixie
  • 14:07 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2352.codfw.wmnet with OS trixie
  • 14:06 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2351.codfw.wmnet with OS trixie
  • 14:06 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2350.codfw.wmnet with OS trixie
  • 14:06 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2349.codfw.wmnet with OS trixie
  • 14:06 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2348.codfw.wmnet with OS trixie
  • 14:03 blake@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host wikikube-worker2349.codfw.wmnet with OS trixie
  • 14:03 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2349.codfw.wmnet with OS trixie
  • 14:03 blake@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host wikikube-worker2348.codfw.wmnet with OS trixie
  • 14:03 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2348.codfw.wmnet with OS trixie
  • 14:02 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2342-2347].codfw.wmnet
  • 14:02 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2342-2347].codfw.wmnet
  • 14:01 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2347.codfw.wmnet with OS trixie
  • 13:57 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2346.codfw.wmnet with OS trixie
  • 13:55 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2343.codfw.wmnet with OS trixie
  • 13:50 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2345.codfw.wmnet with OS trixie
  • 13:48 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2344.codfw.wmnet with OS trixie
  • 13:45 dreamyjazz@deploy2002: mwscript-k8s job started: foreachwikiindblist checkuser-suggested-investigations CheckUser:queueAutoCloseSICases.php # T418591
  • 13:43 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2342.codfw.wmnet with OS trixie
  • 13:42 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2347.codfw.wmnet with reason: host reimage
  • 13:38 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2346.codfw.wmnet with reason: host reimage
  • 13:35 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2343.codfw.wmnet with reason: host reimage
  • 13:31 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2345.codfw.wmnet with reason: host reimage
  • 13:28 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2344.codfw.wmnet with reason: host reimage
  • 13:24 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2342.codfw.wmnet with reason: host reimage
  • 13:21 Dreamy_Jazz: Running foreachwikiindblist checkuser-suggested-investigations.dblist ~/PopulateSiuInfo.php --batch-size=1000 for T411118
  • 13:21 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2347.codfw.wmnet with reason: host reimage
  • 13:20 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2346.codfw.wmnet with reason: host reimage
  • 13:20 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2345.codfw.wmnet with reason: host reimage
  • 13:20 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2344.codfw.wmnet with reason: host reimage
  • 13:20 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2343.codfw.wmnet with reason: host reimage
  • 13:19 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2342.codfw.wmnet with reason: host reimage
  • 13:07 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2347.codfw.wmnet with OS trixie
  • 13:07 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2346.codfw.wmnet with OS trixie
  • 13:07 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2345.codfw.wmnet with OS trixie
  • 13:07 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2344.codfw.wmnet with OS trixie
  • 13:06 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2343.codfw.wmnet with OS trixie
  • 13:06 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2342.codfw.wmnet with OS trixie
  • 13:05 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2336-2341].codfw.wmnet
  • 13:05 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2336-2341].codfw.wmnet
  • 13:01 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2341.codfw.wmnet with OS trixie
  • 12:54 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2340.codfw.wmnet with OS trixie
  • 12:49 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2337.codfw.wmnet with OS trixie
  • 12:45 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2338.codfw.wmnet with OS trixie
  • 12:43 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2336.codfw.wmnet with OS trixie
  • 12:40 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2341.codfw.wmnet with reason: host reimage
  • 12:35 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2339.codfw.wmnet with OS trixie
  • 12:34 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2340.codfw.wmnet with reason: host reimage
  • 12:31 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2337.codfw.wmnet with reason: host reimage
  • 12:26 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2338.codfw.wmnet with reason: host reimage
  • 12:22 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2336.codfw.wmnet with reason: host reimage
  • 12:18 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2339.codfw.wmnet with reason: host reimage
  • 12:13 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2340.codfw.wmnet with reason: host reimage
  • 12:13 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2341.codfw.wmnet with reason: host reimage
  • 12:12 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2337.codfw.wmnet with reason: host reimage
  • 12:12 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2338.codfw.wmnet with reason: host reimage
  • 12:12 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2336.codfw.wmnet with reason: host reimage
  • 12:12 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2339.codfw.wmnet with reason: host reimage
  • 12:00 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2341.codfw.wmnet with OS trixie
  • 12:00 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2340.codfw.wmnet with OS trixie
  • 11:59 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2339.codfw.wmnet with OS trixie
  • 11:59 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2338.codfw.wmnet with OS trixie
  • 11:59 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2337.codfw.wmnet with OS trixie
  • 11:59 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2336.codfw.wmnet with OS trixie
  • 11:56 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2333-2335].codfw.wmnet
  • 11:55 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2333-2335].codfw.wmnet
  • 11:55 elukey@cumin1003: END (FAIL) - Cookbook sre.kafka.change-confluent-distro-version (exit_code=99) Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
  • 11:54 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1207.eqiad.wmnet
  • 11:54 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2335.codfw.wmnet with OS trixie
  • 11:53 moritzm: uploaded icu 72.1-3+deb12u1~wmf11u1 to component/php83-icu72 T419058 (backport of ICU 72 from Bookworm to Bullseye, built to be co-installable with the native ICU from Bullseye)
  • 11:50 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2334.codfw.wmnet with OS trixie
  • 11:49 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1207.eqiad.wmnet
  • 11:48 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1205.eqiad.wmnet
  • 11:45 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2333.codfw.wmnet with OS trixie
  • 11:39 elukey@cumin1003: START - Cookbook sre.kafka.change-confluent-distro-version Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
  • 11:39 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1205.eqiad.wmnet
  • 11:34 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2335.codfw.wmnet with reason: host reimage
  • 11:30 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2334.codfw.wmnet with reason: host reimage
  • 11:27 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2333.codfw.wmnet with reason: host reimage
  • 11:23 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2335.codfw.wmnet with reason: host reimage
  • 11:22 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2334.codfw.wmnet with reason: host reimage
  • 11:21 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2333.codfw.wmnet with reason: host reimage
  • 11:09 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2334.codfw.wmnet with OS trixie
  • 11:09 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2335.codfw.wmnet with OS trixie
  • 11:08 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2333.codfw.wmnet with OS trixie
  • 11:06 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2332.codfw.wmnet
  • 11:05 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2332.codfw.wmnet
  • 11:02 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2332.codfw.wmnet with OS trixie
  • 10:43 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2332.codfw.wmnet with reason: host reimage
  • 10:36 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2332.codfw.wmnet with reason: host reimage
  • 10:23 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2332.codfw.wmnet with OS trixie
  • 10:23 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1199.eqiad.wmnet
  • 10:21 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1194.eqiad.wmnet
  • 10:16 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2332-2356].codfw.wmnet
  • 10:13 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1194.eqiad.wmnet
  • 10:09 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
  • 10:09 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1199.eqiad.wmnet
  • 10:09 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
  • 10:03 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2332-2356].codfw.wmnet
  • 09:59 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
  • 09:59 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
  • 09:39 Emperor: repool ms-fe1013 after PXE work T401966
  • 09:23 derick@deploy2002: mwscript-k8s job started: extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=pmswiki --logwiki=metawiki Wikilimes Limes.pink # T419184
  • 09:10 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-fe1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 09:09 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ms-fe1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 09:08 elukey@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ms-fe1013.eqiad.wmnet
  • 09:08 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-fe1013.eqiad.wmnet
  • 08:57 elukey@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-fe1013.eqiad.wmnet
  • 08:56 elukey@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ms-fe1013.eqiad.wmnet
  • 08:54 elukey@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ms-fe1013.eqiad.wmnet
  • 08:42 elukey@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ms-fe1013.eqiad.wmnet
  • 08:25 moritzm: uploaded openjdk-8 8u482-ga-1~deb12u1 to component/jdk8 of bookworm-wikimedia
  • 08:11 moritzm: imported prometheus-ganeti-exporter 0.3+deb12u2 for bookworm-wikimedia T419166
  • 06:23 ryankemper@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikidata-query-gui: apply
  • 06:23 ryankemper@deploy2002: helmfile [codfw] START helmfile.d/services/wikidata-query-gui: apply
  • 06:23 ryankemper@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikidata-query-gui: apply
  • 06:23 ryankemper@deploy2002: helmfile [eqiad] START helmfile.d/services/wikidata-query-gui: apply
  • 06:22 ryankemper@deploy2002: helmfile [staging] DONE helmfile.d/services/wikidata-query-gui: apply
  • 06:22 ryankemper@deploy2002: helmfile [staging] START helmfile.d/services/wikidata-query-gui: apply
  • 02:59 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 02:59 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Change frack mgmt vlan interface - pt1979@cumin2002"
  • 02:59 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Change frack mgmt vlan interface - pt1979@cumin2002"
  • 02:56 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 02:21 zabe: zabe@deploy2002:/srv/mediawiki-staging$ foreachwiki extensions/TimedMediaHandler/maintenance/migrateTranscodeStates.php --force # T415064
  • 02:16 zabe@deploy2002: Finished scap sync-world: Backport for Update interwiki cache (duration: 06m 38s)
  • 02:12 zabe@deploy2002: mwscript-k8s job started: foreachwikiindblist wikidataclient extensions/Wikibase/lib/maintenance/populateSitesTable.php --force-protocol https # T415978, T414241
  • 02:12 zabe@deploy2002: zabe: Continuing with sync
  • 02:11 zabe@deploy2002: zabe: Backport for Update interwiki cache synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 02:09 zabe@deploy2002: Started scap sync-world: Backport for Update interwiki cache
  • 02:09 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 23s)
  • 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
  • 01:59 zabe@deploy2002: Finished scap sync-world: Backport for Set urwikisource to rtl (T415960) (duration: 06m 39s)
  • 01:55 zabe@deploy2002: zabe: Continuing with sync
  • 01:54 zabe@deploy2002: zabe: Backport for Set urwikisource to rtl (T415960) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 01:53 zabe@deploy2002: Started scap sync-world: Backport for Set urwikisource to rtl (T415960)
  • 01:45 zabe@deploy2002: Sync cancelled.
  • 01:43 zabe@deploy2002: zabe: Backport for Activate urwikisource (T415960) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 01:42 zabe@deploy2002: Started scap sync-world: Backport for Activate urwikisource (T415960)
  • 01:38 zabe@deploy2002: Finished scap sync-world: Backport for Prepare urwikisource (T415960) (duration: 06m 18s)
  • 01:34 zabe@deploy2002: zabe: Continuing with sync
  • 01:34 zabe@deploy2002: zabe: Backport for Prepare urwikisource (T415960) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 01:32 zabe@deploy2002: Started scap sync-world: Backport for Prepare urwikisource (T415960)
  • 01:29 zabe@deploy2002: Finished scap sync-world: Backport for Activate kaiwiki (T414234) (duration: 06m 57s)
  • 01:25 zabe@deploy2002: zabe: Continuing with sync
  • 01:24 zabe@deploy2002: zabe: Backport for Activate kaiwiki (T414234) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 01:22 zabe@deploy2002: Started scap sync-world: Backport for Activate kaiwiki (T414234)
  • 01:17 zabe@deploy2002: Finished scap sync-world: Backport for Prepare kaiwiki (T414234) (duration: 07m 25s)
  • 01:13 zabe@deploy2002: zabe: Continuing with sync
  • 01:11 zabe@deploy2002: zabe: Backport for Prepare kaiwiki (T414234) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 01:09 zabe@deploy2002: Started scap sync-world: Backport for Prepare kaiwiki (T414234)
  • 00:33 zabe@deploy2002: Finished scap sync-world: Backport for Stop writing to il_to on all wikis except commons (T415787) (duration: 06m 22s)
  • 00:29 zabe@deploy2002: zabe: Continuing with sync
  • 00:28 zabe@deploy2002: zabe: Backport for Stop writing to il_to on all wikis except commons (T415787) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 00:27 zabe@deploy2002: Started scap sync-world: Backport for Stop writing to il_to on all wikis except commons (T415787)
  • 00:05 catrope@deploy2002: Finished scap sync-world: Backport for Re-enable AllowUserJs (T419137) (duration: 08m 08s)
  • 00:01 catrope@deploy2002: catrope, kharlan: Continuing with sync

2026-03-05

  • 23:58 catrope@deploy2002: catrope, kharlan: Backport for Re-enable AllowUserJs (T419137) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 23:56 catrope@deploy2002: Started scap sync-world: Backport for Re-enable AllowUserJs (T419137)
  • 23:52 catrope@deploy2002: Finished scap sync-world: Backport for CSP: Update false positives list (duration: 06m 34s)
  • 23:52 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host contint2003.wikimedia.org with OS trixie
  • 23:47 catrope@deploy2002: catrope: Continuing with sync
  • 23:47 catrope@deploy2002: catrope: Backport for CSP: Update false positives list synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 23:45 catrope@deploy2002: Started scap sync-world: Backport for CSP: Update false positives list
  • 23:33 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on contint2003.wikimedia.org with reason: host reimage
  • 23:29 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on contint2003.wikimedia.org with reason: host reimage
  • 23:15 zabe@deploy2002: Finished scap sync-world: Backport for Using Hadoop for MostTranscludedPages on commonswiki (T416927) (duration: 06m 27s)
  • 23:11 zabe@deploy2002: zabe: Continuing with sync
  • 23:10 zabe@deploy2002: zabe: Backport for Using Hadoop for MostTranscludedPages on commonswiki (T416927) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 23:09 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host contint2003.wikimedia.org with OS trixie
  • 23:08 zabe@deploy2002: Started scap sync-world: Backport for Using Hadoop for MostTranscludedPages on commonswiki (T416927)
  • 22:45 maryum: Deployed security fix for T418254
  • 22:35 zabe@deploy2002: Finished scap sync-world: Backport for SpecialWantedFiles: Use lt_title instead of lt_to (T299953) (duration: 06m 12s)
  • 22:31 zabe@deploy2002: zabe: Continuing with sync
  • 22:30 zabe@deploy2002: zabe: Backport for SpecialWantedFiles: Use lt_title instead of lt_to (T299953) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 22:28 zabe@deploy2002: Started scap sync-world: Backport for SpecialWantedFiles: Use lt_title instead of lt_to (T299953)
  • 21:43 ebernhardson@deploy2002: Finished scap sync-world: Backport for cirrus: Align semanticsearch cluster group name with routing (T413969) (duration: 07m 20s)
  • 21:39 ebernhardson@deploy2002: ebernhardson: Continuing with sync
  • 21:38 ebernhardson@deploy2002: ebernhardson: Backport for cirrus: Align semanticsearch cluster group name with routing (T413969) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 21:36 ebernhardson@deploy2002: Started scap sync-world: Backport for cirrus: Align semanticsearch cluster group name with routing (T413969)
  • 21:04 jhathaway@dns1004: END - running authdns-update
  • 21:02 jhathaway@dns1004: START - running authdns-update
  • 20:53 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 20:52 jasmine@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add new service IPs for sophroid - jasmine@cumin2002"
  • 20:52 jasmine@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add new service IPs for sophroid - jasmine@cumin2002"
  • 20:47 jasmine@cumin2002: START - Cookbook sre.dns.netbox
  • 20:28 cdanis: apt built and imported jwt-authorizer 1.3.0-1
  • 20:16 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.18 refs T413809
  • 20:04 krinkle@deploy2002: Finished scap sync-world: Backport for Allow toolforge APIs in enforced CSP mode (T135963 T419137 T220475) (duration: 07m 37s)
  • 20:00 krinkle@deploy2002: krinkle: Continuing with sync
  • 19:58 krinkle@deploy2002: krinkle: Backport for Allow toolforge APIs in enforced CSP mode (T135963 T419137 T220475) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 19:56 krinkle@deploy2002: Started scap sync-world: Backport for Allow toolforge APIs in enforced CSP mode (T135963 T419137 T220475)
  • 19:21 sbassett@deploy2002: Finished scap sync-world: Backport for Re-enable Site JS (T419137 T419138) (duration: 06m 57s)
  • 19:17 sbassett@deploy2002: sbassett: Continuing with sync
  • 19:16 sbassett@deploy2002: sbassett: Backport for Re-enable Site JS (T419137 T419138) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 19:15 sbassett@deploy2002: Started scap sync-world: Backport for Re-enable Site JS (T419137 T419138)
  • 19:04 dr0ptp4kt: Deploying change 1239200 for refinery ( T416481 ) using scap, then deployed onto hdfs
  • 19:03 dr0ptp4kt: Deployed refinery change 1240253 ( T414478 ), 1240253 (no-op) for refinery ( T414478 ) using scap, then deployed onto hdfs
  • 18:58 dr0ptp4kt@deploy2002: Finished deploy [analytics/refinery@dd641b1] (thin): Regular analytics weekly train THIN [analytics/refinery@dd641b15] (duration: 02m 02s)
  • 18:56 dr0ptp4kt@deploy2002: Started deploy [analytics/refinery@dd641b1] (thin): Regular analytics weekly train THIN [analytics/refinery@dd641b15]
  • 18:55 dr0ptp4kt@deploy2002: Finished deploy [analytics/refinery@dd641b1]: Regular analytics weekly train [analytics/refinery@dd641b15] (duration: 04m 18s)
  • 18:50 dr0ptp4kt@deploy2002: Started deploy [analytics/refinery@dd641b1]: Regular analytics weekly train [analytics/refinery@dd641b15]
  • 18:49 dr0ptp4kt@deploy2002: Finished deploy [analytics/refinery@dd641b1] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@dd641b15] (duration: 01m 57s)
  • 18:47 dr0ptp4kt: Deploying change 1239200 for refinery ( T416481 )
  • 18:47 dr0ptp4kt@deploy2002: Started deploy [analytics/refinery@dd641b1] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@dd641b15]
  • 18:31 eevans@dns1004: END - running authdns-update
  • 18:30 eevans@dns1004: START - running authdns-update
  • 18:30 sukhe: sudo cumin -b51 "A:cp" "run-puppet-agent --enable 'rolling out 1248544'"
  • 18:16 sukhe: sudo cumin "A:cp" "disable-puppet 'rolling out 1248544'"
  • 18:06 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:06 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Change frack mgmt vlan interface - pt1979@cumin2002"
  • 18:06 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Change frack mgmt vlan interface - pt1979@cumin2002"
  • 18:02 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 17:31 mszwarc@deploy2002: Finished scap sync-world: Backport for Enable wgUseSiteJs on donatewiki (T419138) (duration: 09m 57s)
  • 17:27 mszwarc@deploy2002: mszwarc, krinkle: Continuing with sync
  • 17:26 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host contint2003.wikimedia.org with OS bookworm
  • 17:26 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 17:23 mszwarc@deploy2002: mszwarc, krinkle: Backport for Enable wgUseSiteJs on donatewiki (T419138) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 17:21 mszwarc@deploy2002: Started scap sync-world: Backport for Enable wgUseSiteJs on donatewiki (T419138)
  • 17:16 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1178.eqiad.wmnet
  • 17:16 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 17:12 cgoubert@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1162.eqiad.wmnet
  • 17:12 cgoubert@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1162.eqiad.wmnet
  • 17:10 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for wikikube-worker1162.eqiad.wmnet
  • 17:10 cgoubert@cumin1003: START - Cookbook sre.hosts.remove-downtime for wikikube-worker1162.eqiad.wmnet
  • 17:09 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1178.eqiad.wmnet
  • 17:05 taavi@cumin1003: dbctl commit (dc=all): 'enable writes', diff saved to https://phabricator.wikimedia.org/P89812 and previous config saved to /var/cache/conftool/dbconfig/20260305-170556-taavi.json
  • 16:03 oblivian@cumin1003: dbctl commit (dc=all): 'read only s6', diff saved to https://phabricator.wikimedia.org/P89810 and previous config saved to /var/cache/conftool/dbconfig/20260305-160348-oblivian.json
  • 15:32 taavi@cumin1003: dbctl commit (dc=all): 'set global ro', diff saved to https://phabricator.wikimedia.org/P89808 and previous config saved to /var/cache/conftool/dbconfig/20260305-153203-taavi.json
  • 15:31 mszwarc@deploy2002: mszwarc: Continuing with sync
  • 15:31 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host an-worker1178.eqiad.wmnet
  • 15:31 mszwarc@deploy2002: mszwarc: Backport for Disable custom JS for a moment synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 15:29 mszwarc@deploy2002: Started scap sync-world: Backport for Disable custom JS for a moment
  • 15:25 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['contint2003']
  • 15:25 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['contint2003']
  • 15:23 ebernhardson@deploy2002: Finished scap sync-world: Backport for cirrus: Correct semantic builder config (T413969) (duration: 07m 39s)
  • 15:22 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 15:19 ebernhardson@deploy2002: ebernhardson: Continuing with sync
  • 15:18 ebernhardson@deploy2002: ebernhardson: Backport for cirrus: Correct semantic builder config (T413969) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 15:16 ebernhardson@deploy2002: Started scap sync-world: Backport for cirrus: Correct semantic builder config (T413969)
  • 15:11 elukey@cumin1003: END (PASS) - Cookbook sre.kafka.change-confluent-distro-version (exit_code=0) Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
  • 15:10 ebernhardson@deploy2002: Finished scap sync-world: Backport for cirrus: Add semantic search test cluster (T413969) (duration: 09m 18s)
  • 15:06 ebernhardson@deploy2002: ebernhardson: Continuing with sync
  • 15:04 sukhe@dns1004: END - running authdns-update
  • 15:03 sukhe@dns1004: START - running authdns-update
  • 15:03 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 15:02 ebernhardson@deploy2002: ebernhardson: Backport for cirrus: Add semantic search test cluster (T413969) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 15:02 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host contint2003
  • 15:02 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host contint2003
  • 15:00 ebernhardson@deploy2002: Started scap sync-world: Backport for cirrus: Add semantic search test cluster (T413969)
  • 14:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
  • 14:53 elukey@cumin1003: START - Cookbook sre.kafka.change-confluent-distro-version Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
  • 14:50 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
  • 14:38 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1010
  • 14:38 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1010
  • 14:32 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1010
  • 14:32 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1010
  • 14:32 sukhe@dns1004: END - running authdns-update
  • 14:30 sukhe@dns1004: START - running authdns-update
  • 14:28 elukey@cumin1003: END (FAIL) - Cookbook sre.kafka.change-confluent-distro-version (exit_code=99) Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
  • 14:28 sukhe@dns1004: START - running authdns-update
  • 14:27 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1231.eqiad.wmnet
  • 14:27 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1230.eqiad.wmnet
  • 14:24 bking@dns1004: START - running authdns-update
  • 14:15 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1230.eqiad.wmnet
  • 14:15 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1229.eqiad.wmnet
  • 14:11 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1178.eqiad.wmnet
  • 14:09 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
  • 14:09 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
  • 14:09 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
  • 14:09 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
  • 14:09 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
  • 14:09 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
  • 14:09 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
  • 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revision-models' for release 'main' .
  • 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
  • 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
  • 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
  • 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'readability' for release 'main' .
  • 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] 'sync' command on namespace 'ores-legacy' for release 'main' .
  • 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'logo-detection' for release 'main' .
  • 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
  • 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
  • 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
  • 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
  • 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
  • 14:05 moritzm: imported nodejs 24.14.0-1nodesource1 to thirdparty/node24 T418440
  • 14:03 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1229.eqiad.wmnet
  • 14:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1228.eqiad.wmnet
  • 14:01 moritzm: initialised ganeti02/ulsfo cluster T418993
  • 13:57 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
  • 13:56 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
  • 13:56 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
  • 13:55 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
  • 13:53 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
  • 13:52 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1228.eqiad.wmnet
  • 13:52 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1227.eqiad.wmnet
  • 13:52 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
  • 13:51 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
  • 13:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
  • 13:47 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
  • 13:46 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
  • 13:43 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
  • 13:42 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
  • 13:42 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1199.eqiad.wmnet
  • 13:40 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1227.eqiad.wmnet
  • 13:40 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1226.eqiad.wmnet
  • 13:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
  • 13:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
  • 13:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
  • 13:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
  • 13:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
  • 13:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
  • 13:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
  • 13:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revision-models' for release 'main' .
  • 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
  • 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
  • 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
  • 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'readability' for release 'main' .
  • 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] 'sync' command on namespace 'ores-legacy' for release 'main' .
  • 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'logo-detection' for release 'main' .
  • 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
  • 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
  • 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
  • 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
  • 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
  • 13:35 moritzm: installing glib2.0 security updates
  • 13:35 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1199.eqiad.wmnet
  • 13:33 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
  • 13:33 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
  • 13:33 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
  • 13:33 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
  • 13:26 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1226.eqiad.wmnet
  • 13:26 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1225.eqiad.wmnet
  • 13:26 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
  • 13:15 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1225.eqiad.wmnet
  • 13:15 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1224.eqiad.wmnet
  • 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti4006.ulsfo.wmnet with OS bookworm
  • 13:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
  • 13:07 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
  • 13:07 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:07 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add new VIP for routed ganeti in ulsfo - jmm@cumin2002"
  • 13:06 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add new VIP for routed ganeti in ulsfo - jmm@cumin2002"
  • 13:06 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
  • 13:05 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
  • 13:02 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 13:01 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1224.eqiad.wmnet
  • 13:01 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1223.eqiad.wmnet
  • 13:00 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
  • 12:59 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
  • 12:58 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on wikikube-worker1162.eqiad.wmnet with reason: dcops intervention
  • 12:57 cgoubert@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker1162.eqiad.wmnet
  • 12:56 cgoubert@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker1162.eqiad.wmnet
  • 12:55 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
  • 12:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti4006.ulsfo.wmnet with reason: host reimage
  • 12:49 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1223.eqiad.wmnet
  • 12:49 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1222.eqiad.wmnet
  • 12:46 elukey@cumin1003: START - Cookbook sre.kafka.change-confluent-distro-version Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
  • 12:43 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti4006.ulsfo.wmnet with reason: host reimage
  • 12:37 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1222.eqiad.wmnet
  • 12:37 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1221.eqiad.wmnet
  • 12:23 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1221.eqiad.wmnet
  • 12:23 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1220.eqiad.wmnet
  • 12:23 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti4006.ulsfo.wmnet with OS bookworm
  • 12:11 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1220.eqiad.wmnet
  • 11:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4005.ulsfo.wmnet
  • 11:44 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4005.ulsfo.wmnet
  • 11:33 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1236.eqiad.wmnet
  • 11:29 moritzm: remove ganeti4006 from ganeti/ulsfo cluster T418993
  • 11:25 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1236.eqiad.wmnet
  • 11:25 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1235.eqiad.wmnet
  • 11:16 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1235.eqiad.wmnet
  • 11:16 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1234.eqiad.wmnet
  • 11:09 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1234.eqiad.wmnet
  • 11:09 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1233.eqiad.wmnet
  • 11:02 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1233.eqiad.wmnet
  • 11:02 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1232.eqiad.wmnet
  • 11:02 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
  • 11:00 elukey@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-test-eqiad
  • 10:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti4005.ulsfo.wmnet with OS bookworm
  • 10:55 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 10:55 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1232.eqiad.wmnet
  • 10:55 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1231.eqiad.wmnet
  • 10:54 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 10:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1231.eqiad.wmnet
  • 10:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1230.eqiad.wmnet
  • 10:41 elukey@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-test-eqiad
  • 10:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti4005.ulsfo.wmnet with reason: host reimage
  • 10:37 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1230.eqiad.wmnet
  • 10:37 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1229.eqiad.wmnet
  • 10:32 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti4005.ulsfo.wmnet with reason: host reimage
  • 10:30 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1229.eqiad.wmnet
  • 10:30 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1228.eqiad.wmnet
  • 10:24 moritzm: installing Java 8 security updates
  • 10:22 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1228.eqiad.wmnet
  • 10:22 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1227.eqiad.wmnet
  • 10:15 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1227.eqiad.wmnet
  • 10:14 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1226.eqiad.wmnet
  • 10:11 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4006.ulsfo.wmnet
  • 10:11 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti4005.ulsfo.wmnet with OS bookworm
  • 10:10 jmm@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ganeti4005.ulsfo.wmnet
  • 10:08 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ganeti4005.ulsfo.wmnet
  • 10:08 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:08 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add gw-virtual.ulsfo.wmnet - ayounsi@cumin1003"
  • 10:07 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1226.eqiad.wmnet
  • 10:07 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1225.eqiad.wmnet
  • 09:59 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1225.eqiad.wmnet
  • 09:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1224.eqiad.wmnet
  • 09:52 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1224.eqiad.wmnet
  • 09:51 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1223.eqiad.wmnet
  • 09:44 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1223.eqiad.wmnet
  • 09:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1222.eqiad.wmnet
  • 09:43 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add gw-virtual.ulsfo.wmnet - ayounsi@cumin1003"
  • 09:36 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1222.eqiad.wmnet
  • 09:36 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1221.eqiad.wmnet
  • 09:32 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
  • 09:32 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
  • 09:28 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1221.eqiad.wmnet
  • 09:27 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1220.eqiad.wmnet
  • 09:20 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1220.eqiad.wmnet
  • 09:02 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
  • 08:40 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4006.ulsfo.wmnet
  • 08:38 mszwarc@deploy2002: Finished scap sync-world: Backport for Drop 'centralnoticeadmin' from $wgOATHRequiredForGroups (T418580) (duration: 07m 07s)
  • 08:35 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/airflow-main: apply
  • 08:35 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/airflow-main: apply
  • 08:34 mszwarc@deploy2002: mszwarc: Continuing with sync
  • 08:33 mszwarc@deploy2002: mszwarc: Backport for Drop 'centralnoticeadmin' from $wgOATHRequiredForGroups (T418580) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 08:30 mszwarc@deploy2002: Started scap sync-world: Backport for Drop 'centralnoticeadmin' from $wgOATHRequiredForGroups (T418580)
  • 08:29 gehel@dns1004: END - running authdns-update
  • 08:28 gehel@dns1004: START - running authdns-update
  • 08:27 moritzm: installing mbedtls security updates
  • 08:23 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
  • 08:23 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
  • 08:15 hashar@deploy2002: Finished scap sync-world: Backport for Revert "zhwiki: Add 2026 CNY celebration logos" (duration: 09m 19s)
  • 08:11 hashar@deploy2002: hashar, stang: Continuing with sync
  • 08:08 hashar@deploy2002: hashar, stang: Backport for Revert "zhwiki: Add 2026 CNY celebration logos" synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 08:06 hashar@deploy2002: Started scap sync-world: Backport for Revert "zhwiki: Add 2026 CNY celebration logos"
  • 08:02 moritzm: uploaded openjdk-8 8u482-ga-1~deb11u1 to component/jdk8 of bullseye-wikimedia
  • 07:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts bast4005.wikimedia.org
  • 07:56 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 07:56 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: bast4005.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 07:55 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: bast4005.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 07:48 moritzm: uploaded bird2 2.18-1~wmf13u2 to the main component of trixie-wikimedia T413740
  • 07:47 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 07:47 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
  • 07:42 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts bast4005.wikimedia.org
  • 06:35 marostegui@cumin1003: dbctl commit (dc=all): 'Remove es1033 T408772', diff saved to https://phabricator.wikimedia.org/P89804 and previous config saved to /var/cache/conftool/dbconfig/20260305-063548-marostegui.json
  • 02:10 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 07m 55s)
  • 02:02 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
  • 02:01 zabe@deploy2002: Finished scap sync-world: Backport for Stop writing to il_to on medium size wikis (T415787) (duration: 06m 14s)
  • 01:58 zabe@deploy2002: zabe: Continuing with sync
  • 01:57 zabe@deploy2002: zabe: Backport for Stop writing to il_to on medium size wikis (T415787) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 01:55 zabe@deploy2002: Started scap sync-world: Backport for Stop writing to il_to on medium size wikis (T415787)
  • 01:40 zabe@deploy2002: Finished scap sync-world: Backport for Start reading from new file tables on medium wikis (T416548) (duration: 06m 15s)
  • 01:36 zabe@deploy2002: zabe: Continuing with sync
  • 01:36 zabe@deploy2002: zabe: Backport for Start reading from new file tables on medium wikis (T416548) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 01:34 zabe@deploy2002: Started scap sync-world: Backport for Start reading from new file tables on medium wikis (T416548)
  • 01:29 zabe@deploy2002: Finished scap sync-world: Backport for ImageListPager: Use correct name field for batch lookups (T418327), Revert^2 "ImageListPager: Properly support file schema migration read new" (duration: 07m 21s)
  • 01:25 zabe@deploy2002: zabe: Continuing with sync
  • 01:23 zabe@deploy2002: zabe: Backport for ImageListPager: Use correct name field for batch lookups (T418327), Revert^2 "ImageListPager: Properly support file schema migration read new" synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 01:21 zabe@deploy2002: Started scap sync-world: Backport for ImageListPager: Use correct name field for batch lookups (T418327), Revert^2 "ImageListPager: Properly support file schema migration read new"
  • 00:55 zabe@deploy2002: Finished scap sync-world: Backport for Stop writing to il_to on small wikis (T415787) (duration: 06m 49s)
  • 00:51 zabe@deploy2002: zabe: Continuing with sync
  • 00:50 zabe@deploy2002: zabe: Backport for Stop writing to il_to on small wikis (T415787) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 00:48 zabe@deploy2002: Started scap sync-world: Backport for Stop writing to il_to on small wikis (T415787)
  • 00:19 zabe@deploy2002: Finished scap sync-world: Backport for NewFilesPager: Properly support file schema migration read new (T419062), NewFilesPager: Properly support file schema migration read new (T419062) (duration: 08m 52s)
  • 00:13 zabe@deploy2002: zabe: Continuing with sync
  • 00:12 zabe@deploy2002: zabe: Backport for NewFilesPager: Properly support file schema migration read new (T419062), NewFilesPager: Properly support file schema migration read new (T419062) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 00:10 zabe@deploy2002: Started scap sync-world: Backport for NewFilesPager: Properly support file schema migration read new (T419062), NewFilesPager: Properly support file schema migration read new (T419062)
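
The cookbook entries above come in START / END pairs, so elapsed time can be estimated by matching them up. The following is a minimal sketch, not part of any Wikimedia tooling. The regular expression, the function name `cookbook_durations` and the `SAMPLE` list are illustrative assumptions; the entry layout is inferred only from the lines in this log. It assumes entries are listed newest-first and that both halves of a pair fall on the same day.

```python
import re
from datetime import datetime

# Assumed entry shape, inferred from this log (not an official format spec):
#   "HH:MM user@host: START - Cookbook <name> <details>"
#   "HH:MM user@host: END (STATUS) - Cookbook <name> (exit_code=N) <details>"
ENTRY = re.compile(
    r"^\s*(?:•\s*)?(?P<time>\d{2}:\d{2}) (?P<who>\S+): "
    r"(?P<phase>START|END) (?:\((?P<status>\w+)\) )?- Cookbook (?P<name>\S+)"
    r"(?: \(exit_code=\d+\))?(?P<rest>.*)$"
)


def cookbook_durations(lines):
    """Pair END entries with their later-listed START entries.

    Returns a list of (cookbook, details, status, minutes) tuples.
    Timestamps only have minute resolution, so results are approximate.
    """
    pending_ends = {}
    results = []
    for line in lines:
        match = ENTRY.match(line)
        if match is None:
            continue  # free-form entry, not a cookbook START/END line
        details = match["rest"].strip()
        key = (match["who"], match["name"], details)
        stamp = datetime.strptime(match["time"], "%H:%M")
        if match["phase"] == "END":
            # Newest-first order: the END line is seen before its START.
            pending_ends[key] = (stamp, match["status"])
        elif key in pending_ends:
            end_stamp, status = pending_ends.pop(key)
            minutes = int((end_stamp - stamp).total_seconds() // 60)
            results.append((match["name"], details, status, minutes))
    return results


# Three entries copied from the 2026-03-05 section of this log.
SAMPLE = [
    "  • 10:30 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1228.eqiad.wmnet",
    "  • 10:24 moritzm: installing Java 8 security updates",
    "  • 10:22 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1228.eqiad.wmnet",
]

if __name__ == "__main__":
    for name, details, status, minutes in cookbook_durations(SAMPLE):
        print(f"{name} {details}: {status} after about {minutes} min")
```

Run on the three sample entries, this reports the reboot of an-worker1228.eqiad.wmnet as PASS after about 8 minutes. Entries that span midnight, or END lines with no matching START on the same page, are left unpaired.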

2026-03-04

  • 22:57 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 22:56 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 22:55 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 22:55 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 22:55 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 22:54 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 22:35 tgr_: UTC late deploys done
  • 22:33 tgr@deploy2002: Finished scap sync-world: Backport for Introduce a Semantic Search query route and builder (T413969), Wire up semantic query building (T413969) (duration: 38m 28s)
  • 22:16 tgr@deploy2002: tgr, ebernhardson: Continuing with sync
  • 22:14 tgr@deploy2002: tgr, ebernhardson: Backport for Introduce a Semantic Search query route and builder (T413969), Wire up semantic query building (T413969) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 21:54 tgr@deploy2002: Started scap sync-world: Backport for Introduce a Semantic Search query route and builder (T413969), Wire up semantic query building (T413969)
  • 21:48 tgr@deploy2002: Finished scap sync-world: Backport for Enable JWT session cookie for bot passwords (all wikis) (attempt #3) (T415007 T418999) (duration: 07m 05s)
  • 21:47 bking@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on dse-k8s-worker1028.eqiad.wmnet with reason: broken networking
  • 21:44 tgr@deploy2002: tgr: Continuing with sync
  • 21:43 tgr@deploy2002: tgr: Backport for Enable JWT session cookie for bot passwords (all wikis) (attempt #3) (T415007 T418999) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 21:40 tgr@deploy2002: Started scap sync-world: Backport for Enable JWT session cookie for bot passwords (all wikis) (attempt #3) (T415007 T418999)
  • 21:36 tgr@deploy2002: Finished scap sync-world: Backport for Add synthetic AAA experiment (T418614), Add synthetic AAA experiment (T418614) (duration: 09m 11s)
  • 21:35 bking@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1028.eqiad.wmnet
  • 21:32 tgr@deploy2002: cjming, tgr: Continuing with sync
  • 21:30 bking@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1028.eqiad.wmnet
  • 21:29 tgr@deploy2002: cjming, tgr: Backport for Add synthetic AAA experiment (T418614), Add synthetic AAA experiment (T418614) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 21:27 tgr@deploy2002: Started scap sync-world: Backport for Add synthetic AAA experiment (T418614), Add synthetic AAA experiment (T418614)
  • 21:21 tgr@deploy2002: Finished scap sync-world: Backport for logging: set poolcounter channel log level to info (T418612) (duration: 09m 04s)
  • 21:17 tgr@deploy2002: tgr, cwhite: Continuing with sync
  • 21:14 tgr@deploy2002: tgr, cwhite: Backport for logging: set poolcounter channel log level to info (T418612) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 21:12 tgr@deploy2002: Started scap sync-world: Backport for logging: set poolcounter channel log level to info (T418612)
  • 21:07 tgr@deploy2002: Finished scap sync-world: Backport for Fix $wgJwtSessionCookieIssuer (T415007 T418999) (duration: 09m 55s)
  • 21:03 tgr@deploy2002: tgr: Continuing with sync
  • 20:59 tgr@deploy2002: tgr: Backport for Fix $wgJwtSessionCookieIssuer (T415007 T418999) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 20:57 tgr@deploy2002: Started scap sync-world: Backport for Fix $wgJwtSessionCookieIssuer (T415007 T418999)
  • 19:56 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.18 refs T413809
  • 19:44 jhuneidi@deploy2002: Finished scap sync-world: Backport for CategoryViewer: Fall back to empty string in case of missing nextpage (T418934) (duration: 10m 47s)
  • 19:44 cdobbins@cumin2002: conftool action : set/pooled=yes:weight=1; selector: name=cp205[0-8].codfw.wmnet
  • 19:43 cdobbins@cumin2002: conftool action : set/pooled=yes:weight=1; selector: name=cp2049.codfw.wmnet
  • 19:40 jhuneidi@deploy2002: zabe, jhuneidi: Continuing with sync
  • 19:35 jhuneidi@deploy2002: zabe, jhuneidi: Backport for CategoryViewer: Fall back to empty string in case of missing nextpage (T418934) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 19:34 brett@puppetserver1001: conftool action : set/weight=1; selector: name=cp2043.*
  • 19:34 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp2043.*
  • 19:33 jhuneidi@deploy2002: Started scap sync-world: Backport for CategoryViewer: Fall back to empty string in case of missing nextpage (T418934)
  • 19:30 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp2043.codfw.wmnet with OS trixie
  • 19:23 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
  • 19:22 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
  • 19:22 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
  • 19:22 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
  • 19:09 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp2043.codfw.wmnet with reason: host reimage
  • 19:06 brett@puppetserver1001: conftool action : set/weight=1; selector: name=cp204[45678].*
  • 19:04 brett@puppetserver1001: conftool action : set/weight=100; selector: name=cp204[45678].*
  • 19:02 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp2043.codfw.wmnet with reason: host reimage
  • 18:58 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp204[45678].*
  • 18:52 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
  • 18:51 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
  • 18:50 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
  • 18:50 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
  • 18:49 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
  • 18:49 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
  • 18:49 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
  • 18:48 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
  • 18:48 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
  • 18:47 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
  • 18:47 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
  • 18:47 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp2043.codfw.wmnet with OS trixie
  • 18:46 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox: apply
  • 18:43 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
  • 18:42 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
  • 18:41 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
  • 18:41 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
  • 18:40 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
  • 18:40 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
  • 18:40 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
  • 18:40 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
  • 18:39 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
  • 18:39 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
  • 18:38 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
  • 18:37 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox: apply
  • 18:37 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
  • 18:37 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
  • 18:32 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
  • 18:32 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
  • 18:16 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
  • 18:16 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-video: apply
  • 18:16 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
  • 18:15 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
  • 18:15 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
  • 18:14 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
  • 18:14 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
  • 18:13 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-media: apply
  • 18:13 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
  • 18:13 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
  • 18:12 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox: apply
  • 18:12 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox: apply
  • 17:54 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp2047.codfw.wmnet with OS trixie
  • 17:33 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp2047.codfw.wmnet with reason: host reimage
  • 17:27 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp2047.codfw.wmnet with reason: host reimage
  • 17:23 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
  • 17:23 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
  • 17:18 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
  • 17:18 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
  • 17:15 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
  • 17:13 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
  • 17:09 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp2047.codfw.wmnet with OS trixie
  • 16:55 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
  • 16:55 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
  • 16:54 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
  • 16:54 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
  • 16:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aux-k8s-worker1007.eqiad.wmnet with OS bookworm
  • 16:39 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.09-unlock-scap (exit_code=0) for datacenter switchover from eqiad to codfw
  • 16:39 root@deploy2002: Unlocked for deployment [ALL REPOSITORIES]: Datacenter switchover from eqiad to codfw - T418133 (duration: 25m 37s)
  • 16:39 root@deploy2002: Forcefully removing global lock: Datacenter switchover from eqiad to codfw - T418133
  • 16:39 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.09-unlock-scap for datacenter switchover from eqiad to codfw
  • 16:39 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.09-run-puppet-on-db-masters (exit_code=0) for datacenter switchover from eqiad to codfw
  • 16:27 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4006.ulsfo.wmnet
  • 16:27 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4006.ulsfo.wmnet
  • 16:27 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.09-run-puppet-on-db-masters for datacenter switchover from eqiad to codfw
  • 16:27 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.09-restore-ttl (exit_code=0) for datacenter switchover from eqiad to codfw
  • 16:26 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.09-restore-ttl for datacenter switchover from eqiad to codfw
  • 16:26 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.08-start-maintenance (exit_code=0) for datacenter switchover from eqiad to codfw
  • 16:26 root@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
  • 16:26 root@deploy2002: helmfile [codfw] START helmfile.d/services/mw-cron: apply
  • 16:26 root@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
  • 16:26 root@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
  • 16:26 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.08-start-maintenance for datacenter switchover from eqiad to codfw
  • 16:25 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.08-restart-mw-jobrunner (exit_code=0) for datacenter switchover from eqiad to codfw
  • 16:25 root@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-jobrunner: sync
  • 16:25 root@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-jobrunner: sync
  • 16:25 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.08-restart-mw-jobrunner for datacenter switchover from eqiad to codfw
  • 16:24 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.07-set-readwrite (exit_code=0) for datacenter switchover from eqiad to codfw
  • 16:24 blake@cumin1003: [DRY-RUN] MediaWiki read-only period ends at: 2026-03-04 16:24:40.502004
  • 16:24 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.07-set-readwrite for datacenter switchover from eqiad to codfw
  • 16:24 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.06-set-db-readwrite (exit_code=0) for datacenter switchover from eqiad to codfw
  • 16:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aux-k8s-worker1007.eqiad.wmnet with reason: host reimage
  • 16:24 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.06-set-db-readwrite for datacenter switchover from eqiad to codfw
  • 16:24 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.04-switch-mediawiki (exit_code=0) for datacenter switchover from eqiad to codfw
  • 16:24 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.04-switch-mediawiki for datacenter switchover from eqiad to codfw
  • 16:23 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.03-set-db-readonly (exit_code=0) for datacenter switchover from eqiad to codfw
  • 16:23 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.03-set-db-readonly for datacenter switchover from eqiad to codfw
  • 16:23 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.02-set-readonly (exit_code=0) for datacenter switchover from eqiad to codfw
  • 16:22 blake@cumin1003: [DRY-RUN] MediaWiki read-only period starts at: 2026-03-04 16:22:41.755892
  • 16:22 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.02-set-readonly for datacenter switchover from eqiad to codfw
  • 16:20 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aux-k8s-worker1007.eqiad.wmnet with reason: host reimage
  • 16:20 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.01-stop-maintenance (exit_code=0) for datacenter switchover from eqiad to codfw
  • 16:20 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.01-stop-maintenance for datacenter switchover from eqiad to codfw
  • 16:19 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.00-reduce-ttl (exit_code=0) for datacenter switchover from eqiad to codfw
  • 16:14 moritzm: upgrading cloudservices* to Bird 2.18 T413740
  • 16:14 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.00-reduce-ttl for datacenter switchover from eqiad to codfw
  • 16:13 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.00-lock-scap (exit_code=0) for datacenter switchover from eqiad to codfw
  • 16:13 root@deploy2002: Locking from deployment [ALL REPOSITORIES]: Datacenter switchover from eqiad to codfw - T418133
  • 16:13 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.00-lock-scap for datacenter switchover from eqiad to codfw
  • 16:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4006.ulsfo.wmnet
  • 16:10 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4006.ulsfo.wmnet
  • 16:10 moritzm: remove ganeti4005 from ganeti/ulsfo cluster T418993
  • 16:10 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host aux-k8s-worker1007.eqiad.wmnet with OS bookworm
  • 16:06 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.00-downtime-db-readonly-checks (exit_code=0) for datacenter switchover from eqiad to codfw
  • 16:06 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.00-downtime-db-readonly-checks for datacenter switchover from eqiad to codfw
  • 15:59 XioNoX: push pfw policies - T418402
  • 15:37 sukhe@dns1004: END - running authdns-update
  • 15:36 sukhe@dns1004: START - running authdns-update
  • 15:33 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1219.eqiad.wmnet
  • 15:32 aqu@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 15:31 aqu@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 15:29 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4005.ulsfo.wmnet
  • 15:29 cgoubert@cumin1003: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on P{ms-fe10[14-24].*} and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
  • 15:24 cgoubert@cumin1003: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on P{ms-fe10[14-24].*} and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
  • 15:22 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
  • 15:22 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
  • 15:22 cgoubert@cumin1003: END (ERROR) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=97) rolling restart_daemons on A:swift-fe-eqiad
  • 15:21 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1219.eqiad.wmnet
  • 15:21 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1218.eqiad.wmnet
  • 15:19 cgoubert@cumin1003: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on A:swift-fe-eqiad
  • 15:16 ayounsi@cumin1003: conftool action : set/pooled=yes; selector: name=cirrussearch1120.eqiad.wmnet
  • 15:16 ayounsi@cumin1003: conftool action : set/pooled=yes; selector: name=cirrussearch1121.eqiad.wmnet
  • 15:16 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp1115.eqiad.wmnet [reason: T418772 - BGP maintenance]
  • 15:16 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 15:16 ayounsi@cumin1003: conftool action : set/pooled=yes; selector: name=cirrussearch1122.eqiad.wmnet
  • 15:15 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 15:15 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 15:14 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 15:13 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 15:13 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 15:10 XioNoX: lsw1-d7-eqiad# tools network-instance default protocols bgp neighbor 10.64.128.17 reset-peer - T418772
  • 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-test-eqiad
  • 15:09 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1218.eqiad.wmnet
  • 15:09 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1217.eqiad.wmnet
  • 15:09 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 15:08 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 15:08 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 15:05 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 15:05 moritzm: upgrading cloudlb* to Bird 2.18 T413740
  • 15:05 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 15:04 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 14:58 Dreamy_Jazz: Afternoon UTC backport window done
  • 14:58 dreamyjazz@deploy2002: Finished scap sync-world: Backport for zhwiki: Remove all rights from accountcreator (T418089) (duration: 08m 12s)
  • 14:57 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
  • 14:57 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1217.eqiad.wmnet
  • 14:57 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1216.eqiad.wmnet
  • 14:57 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
  • 14:56 btullis@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6 days, 0:00:00 on dse-k8s-worker[1010-1011,1013,1018-1019].eqiad.wmnet with reason: Adding 10 Gbps NIC
  • 14:54 dreamyjazz@deploy2002: dreamyjazz, 1f616emo: Continuing with sync
  • 14:52 jmm@cumin2002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-test-eqiad
  • 14:52 dreamyjazz@deploy2002: dreamyjazz, 1f616emo: Backport for zhwiki: Remove all rights from accountcreator (T418089) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:50 dreamyjazz@deploy2002: Started scap sync-world: Backport for zhwiki: Remove all rights from accountcreator (T418089)
  • 14:45 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1216.eqiad.wmnet
  • 14:45 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1215.eqiad.wmnet
  • 14:44 dreamyjazz@deploy2002: Finished scap sync-world: Backport for Hooks: Fix liquidthreads log type definition bugs (T417425 T419006), Define $wgWikimediaMessagesHasLiquidThreadsLogs (T417425) (duration: 07m 11s)
  • 14:44 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp1115.eqiad.wmnet [reason: T418772 - BGP maintenance]
  • 14:44 taavi: updating CR firewall policy with https://gerrit.wikimedia.org/r/c/operations/homer/public/+/970275
  • 14:42 ayounsi@cumin1003: conftool action : set/pooled=no; selector: name=cirrussearch1122.eqiad.wmnet
  • 14:42 ayounsi@cumin1003: conftool action : set/pooled=no; selector: name=cirrussearch1121.eqiad.wmnet
  • 14:42 ayounsi@cumin1003: conftool action : set/pooled=no; selector: name=cirrussearch1120.eqiad.wmnet
  • 14:40 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
  • 14:39 dreamyjazz@deploy2002: dreamyjazz: Backport for Hooks: Fix liquidthreads log type definition bugs (T417425 T419006), Define $wgWikimediaMessagesHasLiquidThreadsLogs (T417425) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:37 dreamyjazz@deploy2002: Started scap sync-world: Backport for Hooks: Fix liquidthreads log type definition bugs (T417425 T419006), Define $wgWikimediaMessagesHasLiquidThreadsLogs (T417425)
  • 14:33 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1215.eqiad.wmnet
  • 14:33 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1214.eqiad.wmnet
  • 14:32 btullis@puppetserver1001: conftool action : get/pooled=no; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-worker1028.eqiad.wmnet
  • 14:31 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-worker1028.eqiad.wmnet
  • 14:31 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-worker1025.eqiad.wmnet
  • 14:31 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-worker1024.eqiad.wmnet
  • 14:31 btullis@puppetserver1001: conftool action : set/weight=1; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-worker1028.eqiad.wmnet
  • 14:31 btullis@puppetserver1001: conftool action : set/weight=1; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-worker1025.eqiad.wmnet
  • 14:31 btullis@puppetserver1001: conftool action : set/weight=1; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-worker1024.eqiad.wmnet
  • 14:30 btullis@puppetserver1001: conftool action : get/pooled; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-worker1024.eqiad.wmnet
  • 14:29 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_esams and A:cp - 3.0 upgrade ()
  • 14:27 arnaudb@dns1004: END - running authdns-update
  • 14:26 arnaudb@dns1004: START - running authdns-update
  • 14:26 tgr@deploy2002: Finished scap sync-world: Backport for Revert "Enable JWT session cookie for bot passwords (all wikis) (attempt #2)" (T415007 T418999) (duration: 07m 19s)
  • 14:22 tgr@deploy2002: tgr: Continuing with sync
  • 14:21 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1214.eqiad.wmnet
  • 14:21 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1213.eqiad.wmnet
  • 14:21 tgr@deploy2002: tgr: Backport for Revert "Enable JWT session cookie for bot passwords (all wikis) (attempt #2)" (T415007 T418999) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:19 tgr@deploy2002: Started scap sync-world: Backport for Revert "Enable JWT session cookie for bot passwords (all wikis) (attempt #2)" (T415007 T418999)
  • 14:14 sgimeno@deploy2002: Finished scap sync-world: Backport for Enable new HTML confirmation emails for all (T416748) (duration: 07m 46s)
  • 14:13 elukey@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
  • 14:13 elukey@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
  • 14:10 sgimeno@deploy2002: migr, sgimeno: Continuing with sync
  • 14:09 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1213.eqiad.wmnet
  • 14:09 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1212.eqiad.wmnet
  • 14:09 sgimeno@deploy2002: migr, sgimeno: Backport for Enable new HTML confirmation emails for all (T416748) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:08 elukey@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
  • 14:08 elukey@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
  • 14:08 elukey@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
  • 14:07 elukey@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
  • 14:07 sgimeno@deploy2002: Started scap sync-world: Backport for Enable new HTML confirmation emails for all (T416748)
  • 13:59 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4005.ulsfo.wmnet
  • 13:58 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4005.ulsfo.wmnet
  • 13:57 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4005.ulsfo.wmnet
  • 13:57 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1212.eqiad.wmnet
  • 13:57 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1211.eqiad.wmnet
  • 13:49 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_esams and A:cp - 3.0 upgrade ()
  • 13:45 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1211.eqiad.wmnet
  • 13:45 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1210.eqiad.wmnet
  • 13:43 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_esams and A:cp - 3.0 upgrade ()
  • 13:40 arnaudb@dns1004: END - running authdns-update
  • 13:39 arnaudb@dns1004: START - running authdns-update
  • 13:37 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_eqiad and A:cp - 3.0 upgrade ()
  • 13:33 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1210.eqiad.wmnet
  • 13:33 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1209.eqiad.wmnet
  • 13:20 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1209.eqiad.wmnet
  • 13:20 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1208.eqiad.wmnet
  • 13:17 aokoth@deploy2002: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
  • 13:17 aokoth@deploy2002: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
  • 13:16 aokoth@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 13:15 aokoth@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 13:06 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1208.eqiad.wmnet
  • 13:06 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host an-worker1207.eqiad.wmnet
  • 13:03 arnaudb@dns1005: END - running authdns-update
  • 13:02 arnaudb@dns1005: START - running authdns-update
  • 13:00 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_esams and A:cp - 3.0 upgrade ()
  • 13:00 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_eqiad and A:cp - 3.0 upgrade ()
  • 12:46 mvolz@deploy2002: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
  • 12:45 mvolz@deploy2002: helmfile [eqiad] START helmfile.d/services/citoid: apply
  • 12:44 mvolz@deploy2002: helmfile [codfw] DONE helmfile.d/services/citoid: apply
  • 12:44 mvolz@deploy2002: helmfile [codfw] START helmfile.d/services/citoid: apply
  • 12:43 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/citoid: apply
  • 12:43 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/citoid: apply
  • 12:33 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/citoid: apply
  • 12:29 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/citoid: apply
  • 12:10 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_eqiad and A:cp - 3.0 upgrade ()
  • 12:08 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_eqiad and A:cp - 3.0 upgrade ()
  • 12:03 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1207.eqiad.wmnet
  • 12:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1206.eqiad.wmnet
  • 11:53 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1206.eqiad.wmnet
  • 11:53 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1205.eqiad.wmnet
  • 11:36 jmm@cumin2002: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-f8-eqiad
  • 11:36 jmm@cumin2002: START - Cookbook sre.network.tls for network device lsw1-f8-eqiad
  • 11:34 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_eqiad and A:cp - 3.0 upgrade ()
  • 11:34 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_eqiad and A:cp - 3.0 upgrade ()
  • 11:28 dreamyjazz@deploy2002: Finished scap sync-world: Backport for SI: Update instrumentation schema (T418293) (duration: 16m 22s)
  • 11:22 fabfur: start upgrading haproxy to 3.0 on A:cp-eqiad (T417253)
  • 11:22 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
  • 11:17 dreamyjazz@deploy2002: dreamyjazz: Backport for SI: Update instrumentation schema (T418293) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 11:13 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_drmrs and A:cp - 3.0 upgrade ()
  • 11:12 dreamyjazz@deploy2002: Started scap sync-world: Backport for SI: Update instrumentation schema (T418293)
  • 11:08 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_drmrs and A:cp - 3.0 upgrade ()
  • 11:07 blake@cumin1003: END (FAIL) - Cookbook sre.k8s.roll-reimage-nodes (exit_code=1) rolling reimage on P{wikikube-worker[2332-2356].codfw.wmnet} and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
  • 11:07 blake@cumin1003: START - Cookbook sre.k8s.roll-reimage-nodes rolling reimage on P{wikikube-worker[2332-2356].codfw.wmnet} and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
  • 11:06 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2332-2356].codfw.wmnet
  • 11:06 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2332-2356].codfw.wmnet
  • 11:03 blake@cumin1003: END (FAIL) - Cookbook sre.k8s.roll-reimage-nodes (exit_code=1) rolling reimage on P{wikikube-worker[2332-2356].codfw.wmnet} and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
  • 11:03 blake@cumin1003: START - Cookbook sre.k8s.roll-reimage-nodes rolling reimage on P{wikikube-worker[2332-2356].codfw.wmnet} and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
  • 10:55 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2332-2356].codfw.wmnet
  • 10:55 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2332-2356].codfw.wmnet
  • 10:54 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1205.eqiad.wmnet
  • 10:54 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1204.eqiad.wmnet
  • 10:53 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
  • 10:53 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
  • 10:42 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1204.eqiad.wmnet
  • 10:42 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1203.eqiad.wmnet
  • 10:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
  • 10:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
  • 10:28 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1203.eqiad.wmnet
  • 10:28 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1202.eqiad.wmnet
  • 10:25 fabfur: start upgrading haproxy to 3.0 on A:cp-drmrs (T417253)
  • 10:25 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_drmrs and A:cp - 3.0 upgrade ()
  • 10:25 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_drmrs and A:cp - 3.0 upgrade ()
  • 10:24 ladsgroup@deploy2002: Finished scap sync-world: Backport for WebPHandler: Allow the original being served on the web (T414805 T418745 T418346), WebPHandler: Allow the original being served on the web (T414805 T418745 T418346) (duration: 06m 42s)
  • 10:22 arnaudb@dns1004: END - running authdns-update
  • 10:20 arnaudb@dns1004: START - running authdns-update
  • 10:20 ladsgroup@deploy2002: ladsgroup: Continuing with sync
  • 10:20 ladsgroup@deploy2002: ladsgroup: Backport for WebPHandler: Allow the original being served on the web (T414805 T418745 T418346), WebPHandler: Allow the original being served on the web (T414805 T418745 T418346) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 10:18 ladsgroup@deploy2002: Started scap sync-world: Backport for WebPHandler: Allow the original being served on the web (T414805 T418745 T418346), WebPHandler: Allow the original being served on the web (T414805 T418745 T418346)
  • 10:16 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1202.eqiad.wmnet
  • 10:16 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1201.eqiad.wmnet
  • 10:09 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
  • 10:08 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
  • 10:06 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
  • 10:05 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
  • 10:04 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1201.eqiad.wmnet
  • 10:04 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1200.eqiad.wmnet
  • 09:57 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
  • 09:57 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
  • 09:50 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1200.eqiad.wmnet
  • 09:39 mszwarc@deploy2002: Finished scap sync-world: Backport for Require 2FA from CentralNotice admins and WMF Trust & Safety (T418580 T417880) (duration: 08m 23s)
  • 09:36 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_codfw and A:cp - 3.0 upgrade ()
  • 09:35 mszwarc@deploy2002: mszwarc: Continuing with sync
  • 09:33 mszwarc@deploy2002: mszwarc: Backport for Require 2FA from CentralNotice admins and WMF Trust & Safety (T418580 T417880) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 09:32 jmm@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts pki-root1002.eqiad.wmnet
  • 09:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host pki-root1002.eqiad.wmnet
  • 09:31 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_codfw and A:cp - 3.0 upgrade ()
  • 09:31 mszwarc@deploy2002: Started scap sync-world: Backport for Require 2FA from CentralNotice admins and WMF Trust & Safety (T418580 T417880)
  • 09:30 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-fe1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 09:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host pki-root1002.eqiad.wmnet
  • 09:20 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts pki-root1002.eqiad.wmnet
  • 09:20 jmm@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts pki-root1002.eqiad.wmnet
  • 09:20 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts pki-root1002.eqiad.wmnet
  • 09:20 jmm@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts pki-root1002.eqiad.wmnet
  • 09:03 gehel: switching off Blazegraph on wdqs2009 (legacy full graph endpoint is end of life) - T411410 / T415073
  • 09:03 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
  • 09:02 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
  • 09:02 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1096.eqiad.wmnet
  • 09:00 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
  • 09:00 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1097.eqiad.wmnet
  • 08:59 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
  • 08:58 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
  • 08:57 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
  • 08:56 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1096.eqiad.wmnet
  • 08:54 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1097.eqiad.wmnet
  • 08:52 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts pki-root1002.eqiad.wmnet
  • 08:49 topranks: disabling IBGP session between ssw1-d1-eqiad and ssw1-d8-eqiad to remove backup paths try #2 T411054
  • 08:36 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on backup1007.eqiad.wmnet,dbprov1004.eqiad.wmnet with reason: network maintenance
  • 08:32 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ms-fe1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 08:31 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-fe1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 08:21 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_codfw and A:cp - 3.0 upgrade ()
  • 08:21 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_codfw and A:cp - 3.0 upgrade ()
  • 08:11 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp5032.*
  • 07:54 topranks: disabling IBGP session between ssw1-d1-eqiad and ssw1-d8-eqiad to remove backup paths T411054
  • 07:43 moritzm: installing libbpf updates from Bookworm point release
  • 05:43 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2179.codfw.wmnet with reason: Maintenance
  • 05:42 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1160.eqiad.wmnet with reason: Maintenance
  • 02:08 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 04s)
  • 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
  • 01:57 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 01:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263 (T418465)', diff saved to https://phabricator.wikimedia.org/P89793 and previous config saved to /var/cache/conftool/dbconfig/20260304-015657-marostegui.json
  • 01:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263', diff saved to https://phabricator.wikimedia.org/P89792 and previous config saved to /var/cache/conftool/dbconfig/20260304-014150-marostegui.json
  • 01:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263', diff saved to https://phabricator.wikimedia.org/P89791 and previous config saved to /var/cache/conftool/dbconfig/20260304-012642-marostegui.json
  • 01:23 zabe@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
  • 01:22 zabe@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
  • 01:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263 (T418465)', diff saved to https://phabricator.wikimedia.org/P89790 and previous config saved to /var/cache/conftool/dbconfig/20260304-011134-marostegui.json
  • 00:46 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1263 (T418465)', diff saved to https://phabricator.wikimedia.org/P89789 and previous config saved to /var/cache/conftool/dbconfig/20260304-004638-marostegui.json
  • 00:46 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1263.eqiad.wmnet with reason: Maintenance
  • 00:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262 (T418465)', diff saved to https://phabricator.wikimedia.org/P89788 and previous config saved to /var/cache/conftool/dbconfig/20260304-004615-marostegui.json
  • 00:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262', diff saved to https://phabricator.wikimedia.org/P89787 and previous config saved to /var/cache/conftool/dbconfig/20260304-003107-marostegui.json
  • 00:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262', diff saved to https://phabricator.wikimedia.org/P89786 and previous config saved to /var/cache/conftool/dbconfig/20260304-001559-marostegui.json
  • 00:00 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262 (T418465)', diff saved to https://phabricator.wikimedia.org/P89785 and previous config saved to /var/cache/conftool/dbconfig/20260304-000052-marostegui.json

2026-03-03

  • 23:35 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1262 (T418465)', diff saved to https://phabricator.wikimedia.org/P89784 and previous config saved to /var/cache/conftool/dbconfig/20260303-233500-marostegui.json
  • 23:34 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1262.eqiad.wmnet with reason: Maintenance
  • 23:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261 (T418465)', diff saved to https://phabricator.wikimedia.org/P89783 and previous config saved to /var/cache/conftool/dbconfig/20260303-233436-marostegui.json
  • 23:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261', diff saved to https://phabricator.wikimedia.org/P89782 and previous config saved to /var/cache/conftool/dbconfig/20260303-231929-marostegui.json
  • 23:10 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
  • 23:08 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
  • 23:08 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
  • 23:07 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
  • 23:05 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
  • 23:05 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
  • 23:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261', diff saved to https://phabricator.wikimedia.org/P89781 and previous config saved to /var/cache/conftool/dbconfig/20260303-230421-marostegui.json
  • 23:04 bking@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1028.eqiad.wmnet
  • 23:02 tgr@deploy2002: Finished scap sync-world: Backport for Do not invalidate anon sessions with non-anon JWT cookies (T415007), Do not invalidate anon sessions with non-anon JWT cookies (T415007), Enable JWT session cookie for bot passwords (all wikis) (attempt #2) (T415007) (duration: 21m 47s)
  • 23:00 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp7008.magru.wmnet [reason: lldpd packet drop issues]
  • 22:58 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp7008 [reason: lldpd packet drop issues]
  • 22:58 tgr@deploy2002: tgr: Continuing with sync
  • 22:56 bking@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1028.eqiad.wmnet
  • 22:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261 (T418465)', diff saved to https://phabricator.wikimedia.org/P89780 and previous config saved to /var/cache/conftool/dbconfig/20260303-224913-marostegui.json
  • 22:45 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
  • 22:45 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
  • 22:44 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
  • 22:44 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
  • 22:42 tgr@deploy2002: tgr: Backport for Do not invalidate anon sessions with non-anon JWT cookies (T415007), Do not invalidate anon sessions with non-anon JWT cookies (T415007), Enable JWT session cookie for bot passwords (all wikis) (attempt #2) (T415007) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 22:40 tgr@deploy2002: Started scap sync-world: Backport for Do not invalidate anon sessions with non-anon JWT cookies (T415007), Do not invalidate anon sessions with non-anon JWT cookies (T415007), Enable JWT session cookie for bot passwords (all wikis) (attempt #2) (T415007)
  • 22:26 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
  • 22:26 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-test: apply
  • 22:23 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1261 (T418465)', diff saved to https://phabricator.wikimedia.org/P89779 and previous config saved to /var/cache/conftool/dbconfig/20260303-222324-marostegui.json
  • 22:23 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1261.eqiad.wmnet with reason: Maintenance
  • 22:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260 (T418465)', diff saved to https://phabricator.wikimedia.org/P89778 and previous config saved to /var/cache/conftool/dbconfig/20260303-222301-marostegui.json
  • 22:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260', diff saved to https://phabricator.wikimedia.org/P89777 and previous config saved to /var/cache/conftool/dbconfig/20260303-220754-marostegui.json
  • 21:59 rzl@deploy2002: Finished scap sync-world: https://gerrit.wikimedia.org/r/1245162 T411807 (duration: 12m 15s)
  • 21:58 rzl@deploy2002: rzl: Continuing with sync
  • 21:56 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
  • 21:56 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
  • 21:55 rzl@deploy2002: rzl: https://gerrit.wikimedia.org/r/1245162 T411807 synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 21:54 rzl@deploy2002: Started scap sync-world: https://gerrit.wikimedia.org/r/1245162 T411807
  • 21:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260', diff saved to https://phabricator.wikimedia.org/P89776 and previous config saved to /var/cache/conftool/dbconfig/20260303-215247-marostegui.json
  • 21:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248 (T418465)', diff saved to https://phabricator.wikimedia.org/P89775 and previous config saved to /var/cache/conftool/dbconfig/20260303-214931-marostegui.json
  • 21:48 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cp2045.codfw.wmnet
  • 21:48 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for cp2045.codfw.wmnet
  • 21:40 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
  • 21:39 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
  • 21:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260 (T418465)', diff saved to https://phabricator.wikimedia.org/P89774 and previous config saved to /var/cache/conftool/dbconfig/20260303-213739-marostegui.json
  • 21:35 jhuneidi@deploy2002: Finished scap sync-world: Backport for REST: show the beta Attribution API in the REST Sandbox (T418522) (duration: 07m 41s)
  • 21:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248', diff saved to https://phabricator.wikimedia.org/P89773 and previous config saved to /var/cache/conftool/dbconfig/20260303-213423-marostegui.json
  • 21:32 jhuneidi@deploy2002: jhuneidi, bpirkle: Continuing with sync
  • 21:30 jhuneidi@deploy2002: jhuneidi, bpirkle: Backport for REST: show the beta Attribution API in the REST Sandbox (T418522) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 21:28 jhuneidi@deploy2002: Started scap sync-world: Backport for REST: show the beta Attribution API in the REST Sandbox (T418522)
  • 21:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248', diff saved to https://phabricator.wikimedia.org/P89772 and previous config saved to /var/cache/conftool/dbconfig/20260303-211915-marostegui.json
  • 21:18 jhuneidi@deploy2002: Finished scap sync-world: Backport for Remove redundant mw-extra wgRestSandboxSpecs entry (duration: 06m 56s)
  • 21:14 jhuneidi@deploy2002: jhuneidi, aaron: Continuing with sync
  • 21:13 jhuneidi@deploy2002: jhuneidi, aaron: Backport for Remove redundant mw-extra wgRestSandboxSpecs entry synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 21:11 jhuneidi@deploy2002: Started scap sync-world: Backport for Remove redundant mw-extra wgRestSandboxSpecs entry
  • 21:10 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1260 (T418465)', diff saved to https://phabricator.wikimedia.org/P89771 and previous config saved to /var/cache/conftool/dbconfig/20260303-211033-marostegui.json
  • 21:10 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1260.eqiad.wmnet with reason: Maintenance
  • 21:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252 (T418465)', diff saved to https://phabricator.wikimedia.org/P89770 and previous config saved to /var/cache/conftool/dbconfig/20260303-211009-marostegui.json
  • 21:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248 (T418465)', diff saved to https://phabricator.wikimedia.org/P89769 and previous config saved to /var/cache/conftool/dbconfig/20260303-210407-marostegui.json
  • 20:58 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cp2045.codfw.wmnet with reason: troubleshooting for T418527
  • 20:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252', diff saved to https://phabricator.wikimedia.org/P89768 and previous config saved to /var/cache/conftool/dbconfig/20260303-205502-marostegui.json
  • 20:51 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7008.magru.wmnet with OS trixie
  • 20:44 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2248 (T418465)', diff saved to https://phabricator.wikimedia.org/P89767 and previous config saved to /var/cache/conftool/dbconfig/20260303-204452-marostegui.json
  • 20:44 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2248.codfw.wmnet with reason: Maintenance
  • 20:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 (T418465)', diff saved to https://phabricator.wikimedia.org/P89766 and previous config saved to /var/cache/conftool/dbconfig/20260303-204439-marostegui.json
  • 20:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252', diff saved to https://phabricator.wikimedia.org/P89765 and previous config saved to /var/cache/conftool/dbconfig/20260303-203954-marostegui.json
  • 20:34 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
  • 20:34 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
  • 20:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P89764 and previous config saved to /var/cache/conftool/dbconfig/20260303-202931-marostegui.json
  • 20:24 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7008.magru.wmnet with reason: host reimage
  • 20:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252 (T418465)', diff saved to https://phabricator.wikimedia.org/P89763 and previous config saved to /var/cache/conftool/dbconfig/20260303-202447-marostegui.json
  • 20:17 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7008.magru.wmnet with reason: host reimage
  • 20:14 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P89762 and previous config saved to /var/cache/conftool/dbconfig/20260303-201423-marostegui.json
  • 20:10 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host an-worker1199.eqiad.wmnet
  • 19:59 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 (T418465)', diff saved to https://phabricator.wikimedia.org/P89761 and previous config saved to /var/cache/conftool/dbconfig/20260303-195916-marostegui.json
  • 19:59 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1252 (T418465)', diff saved to https://phabricator.wikimedia.org/P89760 and previous config saved to /var/cache/conftool/dbconfig/20260303-195900-marostegui.json
  • 19:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1252.eqiad.wmnet with reason: Maintenance
  • 19:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1249 (T418465)', diff saved to https://phabricator.wikimedia.org/P89759 and previous config saved to /var/cache/conftool/dbconfig/20260303-195835-marostegui.json
  • 19:51 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp7008.magru.wmnet with OS trixie
  • 19:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P89758 and previous config saved to /var/cache/conftool/dbconfig/20260303-194327-marostegui.json
  • 19:42 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cp2043.codfw.wmnet
  • 19:42 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for cp2043.codfw.wmnet
  • 19:33 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2247 (T418465)', diff saved to https://phabricator.wikimedia.org/P89757 and previous config saved to /var/cache/conftool/dbconfig/20260303-193351-marostegui.json
  • 19:33 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2247.codfw.wmnet with reason: Maintenance
  • 19:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 (T418465)', diff saved to https://phabricator.wikimedia.org/P89756 and previous config saved to /var/cache/conftool/dbconfig/20260303-193338-marostegui.json
  • 19:28 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P89755 and previous config saved to /var/cache/conftool/dbconfig/20260303-192820-marostegui.json
  • 19:19 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.18 refs T413809
  • 19:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P89754 and previous config saved to /var/cache/conftool/dbconfig/20260303-191830-marostegui.json
  • 19:17 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp2047.codfw.wmnet with OS trixie
  • 19:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1249 (T418465)', diff saved to https://phabricator.wikimedia.org/P89753 and previous config saved to /var/cache/conftool/dbconfig/20260303-191312-marostegui.json
  • 19:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P89752 and previous config saved to /var/cache/conftool/dbconfig/20260303-190323-marostegui.json
  • 18:53 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp2047.codfw.wmnet with reason: host reimage
  • 18:49 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1199.eqiad.wmnet
  • 18:49 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1198.eqiad.wmnet
  • 18:49 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1249 (T418465)', diff saved to https://phabricator.wikimedia.org/P89751 and previous config saved to /var/cache/conftool/dbconfig/20260303-184937-marostegui.json
  • 18:49 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1249.eqiad.wmnet with reason: Maintenance
  • 18:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248 (T418465)', diff saved to https://phabricator.wikimedia.org/P89750 and previous config saved to /var/cache/conftool/dbconfig/20260303-184913-marostegui.json
  • 18:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 (T418465)', diff saved to https://phabricator.wikimedia.org/P89749 and previous config saved to /var/cache/conftool/dbconfig/20260303-184815-marostegui.json
  • 18:47 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp2047.codfw.wmnet with reason: host reimage
  • 18:45 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1096.eqiad.wmnet with OS bullseye
  • 18:36 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1198.eqiad.wmnet
  • 18:36 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1197.eqiad.wmnet
  • 18:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P89747 and previous config saved to /var/cache/conftool/dbconfig/20260303-183406-marostegui.json
  • 18:29 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp2047.codfw.wmnet with OS trixie
  • 18:24 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1197.eqiad.wmnet
  • 18:24 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1196.eqiad.wmnet
  • 18:23 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2246 (T418465)', diff saved to https://phabricator.wikimedia.org/P89746 and previous config saved to /var/cache/conftool/dbconfig/20260303-182346-marostegui.json
  • 18:23 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1096.eqiad.wmnet with reason: host reimage
  • 18:23 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2246.codfw.wmnet with reason: Maintenance
  • 18:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 (T418465)', diff saved to https://phabricator.wikimedia.org/P89745 and previous config saved to /var/cache/conftool/dbconfig/20260303-182321-marostegui.json
  • 18:19 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1096.eqiad.wmnet with reason: host reimage
  • 18:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P89744 and previous config saved to /var/cache/conftool/dbconfig/20260303-181859-marostegui.json
  • 18:11 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1196.eqiad.wmnet
  • 18:11 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1195.eqiad.wmnet
  • 18:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P89743 and previous config saved to /var/cache/conftool/dbconfig/20260303-180814-marostegui.json
  • 18:04 jforrester@deploy2002: Finished scap sync-world: Backport for Style fixes for copy-paste feature (T414072) (duration: 32m 54s)
  • 18:04 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
  • 18:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248 (T418465)', diff saved to https://phabricator.wikimedia.org/P89742 and previous config saved to /var/cache/conftool/dbconfig/20260303-180352-marostegui.json
  • 18:02 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host ms-be1096.eqiad.wmnet with OS bullseye
  • 18:02 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
  • 17:59 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1195.eqiad.wmnet
  • 17:59 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host an-worker1194.eqiad.wmnet
  • 17:55 ariel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
  • 17:53 ariel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
  • 17:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P89741 and previous config saved to /var/cache/conftool/dbconfig/20260303-175304-marostegui.json
  • 17:52 jforrester@deploy2002: jforrester: Continuing with sync
  • 17:51 jforrester@deploy2002: jforrester: Backport for Style fixes for copy-paste feature (T414072) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 17:47 ariel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
  • 17:46 ariel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
  • 17:41 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1194.eqiad.wmnet
  • 17:41 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1193.eqiad.wmnet
  • 17:39 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1248 (T418465)', diff saved to https://phabricator.wikimedia.org/P89740 and previous config saved to /var/cache/conftool/dbconfig/20260303-173914-marostegui.json
  • 17:39 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1248.eqiad.wmnet with reason: Maintenance
  • 17:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247 (T418465)', diff saved to https://phabricator.wikimedia.org/P89739 and previous config saved to /var/cache/conftool/dbconfig/20260303-173850-marostegui.json
  • 17:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 (T418465)', diff saved to https://phabricator.wikimedia.org/P89738 and previous config saved to /var/cache/conftool/dbconfig/20260303-173756-marostegui.json
  • 17:31 jforrester@deploy2002: Started scap sync-world: Backport for Style fixes for copy-paste feature (T414072)
  • 17:30 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1193.eqiad.wmnet
  • 17:30 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1192.eqiad.wmnet
  • 17:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P89736 and previous config saved to /var/cache/conftool/dbconfig/20260303-172343-marostegui.json
  • 17:18 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1192.eqiad.wmnet
  • 17:18 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1191.eqiad.wmnet
  • 17:11 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2245 (T418465)', diff saved to https://phabricator.wikimedia.org/P89735 and previous config saved to /var/cache/conftool/dbconfig/20260303-171149-marostegui.json
  • 17:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2245.codfw.wmnet with reason: Maintenance
  • 17:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240 (T418465)', diff saved to https://phabricator.wikimedia.org/P89734 and previous config saved to /var/cache/conftool/dbconfig/20260303-171126-marostegui.json
  • 17:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P89733 and previous config saved to /var/cache/conftool/dbconfig/20260303-170835-marostegui.json
  • 17:06 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1191.eqiad.wmnet
  • 17:06 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1190.eqiad.wmnet
  • 16:56 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1190.eqiad.wmnet
  • 16:56 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240', diff saved to https://phabricator.wikimedia.org/P89732 and previous config saved to /var/cache/conftool/dbconfig/20260303-165618-marostegui.json
  • 16:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247 (T418465)', diff saved to https://phabricator.wikimedia.org/P89731 and previous config saved to /var/cache/conftool/dbconfig/20260303-165327-marostegui.json
  • 16:41 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1189.eqiad.wmnet
  • 16:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240', diff saved to https://phabricator.wikimedia.org/P89730 and previous config saved to /var/cache/conftool/dbconfig/20260303-164111-marostegui.json
  • 16:34 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
  • 16:30 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1189.eqiad.wmnet
  • 16:30 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1188.eqiad.wmnet
  • 16:28 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1247 (T418465)', diff saved to https://phabricator.wikimedia.org/P89729 and previous config saved to /var/cache/conftool/dbconfig/20260303-162845-marostegui.json
  • 16:28 fceratto@cumin1003: dbctl commit (dc=all): 'Setting x1 codfw weights to 300 T416705', diff saved to https://phabricator.wikimedia.org/P89728 and previous config saved to /var/cache/conftool/dbconfig/20260303-162836-fceratto.json
  • 16:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1247.eqiad.wmnet with reason: Maintenance
  • 16:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240 (T418465)', diff saved to https://phabricator.wikimedia.org/P89727 and previous config saved to /var/cache/conftool/dbconfig/20260303-162603-marostegui.json
  • 16:18 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
  • 16:18 fceratto@cumin1003: dbctl commit (dc=all): 'Setting db1188 weight to 100 T416705', diff saved to https://phabricator.wikimedia.org/P89726 and previous config saved to /var/cache/conftool/dbconfig/20260303-161846-fceratto.json
  • 16:18 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
  • 16:17 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1188.eqiad.wmnet
  • 16:17 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1187.eqiad.wmnet
  • 16:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1166: testing:crash
  • 16:14 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1166: testing:crash
  • 16:13 fceratto@cumin1003: dbctl commit (dc=all): 'Setting db1169 weight to 300 T416705', diff saved to https://phabricator.wikimedia.org/P89724 and previous config saved to /var/cache/conftool/dbconfig/20260303-161323-fceratto.json
  • 16:12 fceratto@cumin1003: dbctl commit (dc=all): 'Setting db1188 weight to 300 T416705', diff saved to https://phabricator.wikimedia.org/P89723 and previous config saved to /var/cache/conftool/dbconfig/20260303-161230-fceratto.json
  • 16:07 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1245.eqiad.wmnet with reason: Maintenance
  • 16:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244 (T418465)', diff saved to https://phabricator.wikimedia.org/P89722 and previous config saved to /var/cache/conftool/dbconfig/20260303-160720-marostegui.json
  • 16:07 brennen@deploy2002: Finished deploy [phabricator/deployment@a883b6d]: deploy phab1004 for T418872 (duration: 01m 07s)
  • 16:06 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1187.eqiad.wmnet
  • 16:06 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1186.eqiad.wmnet
  • 16:05 brennen@deploy2002: Started deploy [phabricator/deployment@a883b6d]: deploy phab1004 for T418872
  • 16:05 brennen@deploy2002: Finished deploy [phabricator/deployment@a883b6d]: deploy phab2002 for T418872 (duration: 00m 32s)
  • 16:04 brennen@deploy2002: Started deploy [phabricator/deployment@a883b6d]: deploy phab2002 for T418872
  • 16:02 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2240 (T418465)', diff saved to https://phabricator.wikimedia.org/P89721 and previous config saved to /var/cache/conftool/dbconfig/20260303-160207-marostegui.json
  • 16:02 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phabricator deploy
  • 16:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2240.codfw.wmnet with reason: Maintenance
  • 16:01 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: Phabricator deploy
  • 16:00 zabe@deploy2002: Finished scap sync-world: Backport for ImageListPager: Use correct name field for batch lookups (T418327) (duration: 09m 28s)
  • 15:54 zabe@deploy2002: zabe: Continuing with sync
  • 15:54 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1186.eqiad.wmnet
  • 15:54 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1185.eqiad.wmnet
  • 15:54 zabe@deploy2002: zabe: Backport for ImageListPager: Use correct name field for batch lookups (T418327) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 15:53 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo and A:cp - 3.0 upgrade ()
  • 15:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244', diff saved to https://phabricator.wikimedia.org/P89720 and previous config saved to /var/cache/conftool/dbconfig/20260303-155212-marostegui.json
  • 15:50 zabe@deploy2002: Started scap sync-world: Backport for ImageListPager: Use correct name field for batch lookups (T418327)
  • 15:49 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo and A:cp - 3.0 upgrade ()
  • 15:45 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ms-fe1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 15:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-fe1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 15:42 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1185.eqiad.wmnet
  • 15:42 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1184.eqiad.wmnet
  • 15:42 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
  • 15:41 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
  • 15:41 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2239.codfw.wmnet with reason: Maintenance
  • 15:41 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
  • 15:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 (T418465)', diff saved to https://phabricator.wikimedia.org/P89719 and previous config saved to /var/cache/conftool/dbconfig/20260303-154104-marostegui.json
  • 15:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244', diff saved to https://phabricator.wikimedia.org/P89718 and previous config saved to /var/cache/conftool/dbconfig/20260303-153704-marostegui.json
  • 15:36 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1352-1359].eqiad.wmnet
  • 15:36 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1352-1359].eqiad.wmnet
  • 15:30 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1184.eqiad.wmnet
  • 15:30 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1183.eqiad.wmnet
  • 15:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P89717 and previous config saved to /var/cache/conftool/dbconfig/20260303-152557-marostegui.json
  • 15:23 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1178.eqiad.wmnet
  • 15:22 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on P{cp5032.*} and A:cp - 3.0 upgrade ()
  • 15:21 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244 (T418465)', diff saved to https://phabricator.wikimedia.org/P89716 and previous config saved to /var/cache/conftool/dbconfig/20260303-152157-marostegui.json
  • 15:19 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1183.eqiad.wmnet
  • 15:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1182.eqiad.wmnet
  • 15:16 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on P{cp5032.*} and A:cp - 3.0 upgrade ()
  • 15:15 fabfur@cumin1003: END (FAIL) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=1) rolling upgrade of HAProxy on A:cp-upload_eqsin and A:cp - 3.0 upgrade ()
  • 15:14 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo and A:cp - 3.0 upgrade ()
  • 15:14 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_eqsin and A:cp - 3.0 upgrade ()
  • 15:13 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo and A:cp - 3.0 upgrade ()
  • 15:13 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1178.eqiad.wmnet
  • 15:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P89715 and previous config saved to /var/cache/conftool/dbconfig/20260303-151049-marostegui.json
  • 15:08 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
  • 15:07 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1182.eqiad.wmnet
  • 15:07 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1181.eqiad.wmnet
  • 14:57 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1244 (T418465)', diff saved to https://phabricator.wikimedia.org/P89714 and previous config saved to /var/cache/conftool/dbconfig/20260303-145727-marostegui.json
  • 14:57 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1244.eqiad.wmnet with reason: Maintenance
  • 14:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243 (T418465)', diff saved to https://phabricator.wikimedia.org/P89713 and previous config saved to /var/cache/conftool/dbconfig/20260303-145704-marostegui.json
  • 14:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 (T418465)', diff saved to https://phabricator.wikimedia.org/P89712 and previous config saved to /var/cache/conftool/dbconfig/20260303-145541-marostegui.json
  • 14:55 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1181.eqiad.wmnet
  • 14:55 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1180.eqiad.wmnet
  • 14:49 moritzm: installing php7.4 security updates
  • 14:46 jayme@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) pool for host wikikube-worker[1352-1359].eqiad.wmnet
  • 14:46 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1352-1359].eqiad.wmnet
  • 14:43 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1180.eqiad.wmnet
  • 14:43 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1179.eqiad.wmnet
  • 14:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P89711 and previous config saved to /var/cache/conftool/dbconfig/20260303-144156-marostegui.json
  • 14:38 gkyziridis@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
  • 14:38 esanders@deploy2002: Finished scap sync-world: Backport for Remove Editing-related config for special wikis (T400063) (duration: 06m 34s)
  • 14:36 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
  • 14:34 esanders@deploy2002: esanders: Continuing with sync
  • 14:34 esanders@deploy2002: esanders: Backport for Remove Editing-related config for special wikis (T400063) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:34 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
  • 14:32 esanders@deploy2002: Started scap sync-world: Backport for Remove Editing-related config for special wikis (T400063)
  • 14:31 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1179.eqiad.wmnet
  • 14:31 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host an-worker1178.eqiad.wmnet
  • 14:31 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2237 (T418465)', diff saved to https://phabricator.wikimedia.org/P89710 and previous config saved to /var/cache/conftool/dbconfig/20260303-143141-marostegui.json
  • 14:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2237.codfw.wmnet with reason: Maintenance
  • 14:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 (T418465)', diff saved to https://phabricator.wikimedia.org/P89709 and previous config saved to /var/cache/conftool/dbconfig/20260303-143117-marostegui.json
  • 14:29 esanders@deploy2002: Finished scap sync-world: Backport for PasteCheck: Enable by default (T405127) (duration: 08m 01s)
  • 14:27 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_eqsin and A:cp - 3.0 upgrade ()
  • 14:27 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1097.eqiad.wmnet
  • 14:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P89708 and previous config saved to /var/cache/conftool/dbconfig/20260303-142649-marostegui.json
  • 14:26 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_eqsin and A:cp - 3.0 upgrade ()
  • 14:25 esanders@deploy2002: esanders: Continuing with sync
  • 14:23 esanders@deploy2002: esanders: Backport for PasteCheck: Enable by default (T405127) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:21 esanders@deploy2002: Started scap sync-world: Backport for PasteCheck: Enable by default (T405127)
  • 14:20 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1097.eqiad.wmnet
  • 14:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P89707 and previous config saved to /var/cache/conftool/dbconfig/20260303-141610-marostegui.json
  • 14:15 esanders@deploy2002: Finished scap sync-world: Backport for Enable Wikibase GraphQL on test.wikidata.org (T417619), Enable Wikibase GraphQL on production wikidata.org (T417619) (duration: 08m 17s)
  • 14:11 esanders@deploy2002: esanders, jakob: Continuing with sync
  • 14:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243 (T418465)', diff saved to https://phabricator.wikimedia.org/P89706 and previous config saved to /var/cache/conftool/dbconfig/20260303-141142-marostegui.json
  • 14:09 esanders@deploy2002: esanders, jakob: Backport for Enable Wikibase GraphQL on test.wikidata.org (T417619), Enable Wikibase GraphQL on production wikidata.org (T417619) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:07 esanders@deploy2002: Started scap sync-world: Backport for Enable Wikibase GraphQL on test.wikidata.org (T417619), Enable Wikibase GraphQL on production wikidata.org (T417619)
  • 14:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P89704 and previous config saved to /var/cache/conftool/dbconfig/20260303-140102-marostegui.json
  • 13:47 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1243 (T418465)', diff saved to https://phabricator.wikimedia.org/P89703 and previous config saved to /var/cache/conftool/dbconfig/20260303-134702-marostegui.json
  • 13:46 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1243.eqiad.wmnet with reason: Maintenance
  • 13:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242 (T418465)', diff saved to https://phabricator.wikimedia.org/P89702 and previous config saved to /var/cache/conftool/dbconfig/20260303-134639-marostegui.json
  • 13:45 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 (T418465)', diff saved to https://phabricator.wikimedia.org/P89701 and previous config saved to /var/cache/conftool/dbconfig/20260303-134554-marostegui.json
  • 13:31 moritzm: installing NSS security updates
  • 13:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P89700 and previous config saved to /var/cache/conftool/dbconfig/20260303-133131-marostegui.json
  • 13:24 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2236 (T418465)', diff saved to https://phabricator.wikimedia.org/P89699 and previous config saved to /var/cache/conftool/dbconfig/20260303-132414-marostegui.json
  • 13:24 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2236.codfw.wmnet with reason: Maintenance
  • 13:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 (T418465)', diff saved to https://phabricator.wikimedia.org/P89698 and previous config saved to /var/cache/conftool/dbconfig/20260303-132350-marostegui.json
  • 13:20 tappof: Thanos: re-enable querier<->ruler cross-site traffic T412924
  • 13:17 dpogorzelski@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=recommendation-api,name=eqiad
  • 13:17 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in eqiad/ml-serve-eqiad: maintenance
  • 13:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P89697 and previous config saved to /var/cache/conftool/dbconfig/20260303-131624-marostegui.json
  • 13:16 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in eqiad/ml-serve-eqiad: maintenance
  • 13:11 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1178.eqiad.wmnet
  • 13:11 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1177.eqiad.wmnet
  • 13:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1359.eqiad.wmnet with OS trixie
  • 13:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P89696 and previous config saved to /var/cache/conftool/dbconfig/20260303-130842-marostegui.json
  • 13:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242 (T418465)', diff saved to https://phabricator.wikimedia.org/P89695 and previous config saved to /var/cache/conftool/dbconfig/20260303-130117-marostegui.json
  • 13:01 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
  • 13:00 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1177.eqiad.wmnet
  • 13:00 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
  • 13:00 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1176.eqiad.wmnet
  • 12:59 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1358.eqiad.wmnet with OS trixie
  • 12:56 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
  • 12:55 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
  • 12:53 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1359.eqiad.wmnet with reason: host reimage
  • 12:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P89694 and previous config saved to /var/cache/conftool/dbconfig/20260303-125335-marostegui.json
  • 12:52 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1357.eqiad.wmnet with OS trixie
  • 12:51 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
  • 12:50 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
  • 12:48 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1359.eqiad.wmnet with reason: host reimage
  • 12:48 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1356.eqiad.wmnet with OS trixie
  • 12:47 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
  • 12:47 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
  • 12:47 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
  • 12:47 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
  • 12:46 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
  • 12:46 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
  • 12:46 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1176.eqiad.wmnet
  • 12:46 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1175.eqiad.wmnet
  • 12:46 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
  • 12:45 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
  • 12:45 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
  • 12:45 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
  • 12:45 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
  • 12:45 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revision-models' for release 'main' .
  • 12:44 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
  • 12:44 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
  • 12:44 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
  • 12:44 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'readability' for release 'main' .
  • 12:43 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] 'sync' command on namespace 'ores-legacy' for release 'main' .
  • 12:43 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'logo-detection' for release 'main' .
  • 12:43 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'edit-check' for release 'main' .
  • 12:43 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
  • 12:43 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1358.eqiad.wmnet with reason: host reimage
  • 12:42 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-models' for release 'main' .
  • 12:42 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
  • 12:41 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
  • 12:40 ladsgroup@deploy2002: Finished scap sync-world: Backport for Enable thumb steps on private wikis too (T414805) (duration: 13m 01s)
  • 12:39 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
  • 12:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 (T418465)', diff saved to https://phabricator.wikimedia.org/P89693 and previous config saved to /var/cache/conftool/dbconfig/20260303-123827-marostegui.json
  • 12:36 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1357.eqiad.wmnet with reason: host reimage
  • 12:36 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1242 (T418465)', diff saved to https://phabricator.wikimedia.org/P89692 and previous config saved to /var/cache/conftool/dbconfig/20260303-123642-marostegui.json
  • 12:36 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1359.eqiad.wmnet with OS trixie
  • 12:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1242.eqiad.wmnet with reason: Maintenance
  • 12:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241 (T418465)', diff saved to https://phabricator.wikimedia.org/P89691 and previous config saved to /var/cache/conftool/dbconfig/20260303-123619-marostegui.json
  • 12:34 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1175.eqiad.wmnet
  • 12:34 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1174.eqiad.wmnet
  • 12:34 dpogorzelski@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=recommendation-api,name=eqiad
  • 12:33 ladsgroup@deploy2002: ladsgroup: Continuing with sync
  • 12:33 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1356.eqiad.wmnet with reason: host reimage
  • 12:31 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
  • 12:31 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in eqiad/ml-serve-eqiad: maintenance
  • 12:31 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
  • 12:31 ladsgroup@deploy2002: ladsgroup: Backport for Enable thumb steps on private wikis too (T414805) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 12:30 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in eqiad/ml-serve-eqiad: maintenance
  • 12:28 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1358.eqiad.wmnet with reason: host reimage
  • 12:28 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1357.eqiad.wmnet with reason: host reimage
  • 12:28 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1356.eqiad.wmnet with reason: host reimage
  • 12:27 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
  • 12:27 ladsgroup@deploy2002: Started scap sync-world: Backport for Enable thumb steps on private wikis too (T414805)
  • 12:26 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
  • 12:22 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1174.eqiad.wmnet
  • 12:22 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1173.eqiad.wmnet
  • 12:21 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
  • 12:21 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P89690 and previous config saved to /var/cache/conftool/dbconfig/20260303-122112-marostegui.json
  • 12:20 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
  • 12:20 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
  • 12:19 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1353.eqiad.wmnet with OS trixie
  • 12:16 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1358.eqiad.wmnet with OS trixie
  • 12:16 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1357.eqiad.wmnet with OS trixie
  • 12:15 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1356.eqiad.wmnet with OS trixie
  • 12:14 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1355.eqiad.wmnet with OS trixie
  • 12:14 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2219 (T418465)', diff saved to https://phabricator.wikimedia.org/P89689 and previous config saved to /var/cache/conftool/dbconfig/20260303-121420-marostegui.json
  • 12:14 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2219.codfw.wmnet with reason: Maintenance
  • 12:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210 (T418465)', diff saved to https://phabricator.wikimedia.org/P89688 and previous config saved to /var/cache/conftool/dbconfig/20260303-121355-marostegui.json
  • 12:09 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1354.eqiad.wmnet with OS trixie
  • 12:08 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1173.eqiad.wmnet
  • 12:08 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1172.eqiad.wmnet
  • 12:06 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P89687 and previous config saved to /var/cache/conftool/dbconfig/20260303-120604-marostegui.json
  • 12:04 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1352.eqiad.wmnet with OS trixie
  • 12:02 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1353.eqiad.wmnet with reason: host reimage
  • 11:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P89686 and previous config saved to /var/cache/conftool/dbconfig/20260303-115847-marostegui.json
  • 11:58 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1355.eqiad.wmnet with reason: host reimage
  • 11:52 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1354.eqiad.wmnet with reason: host reimage
  • 11:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241 (T418465)', diff saved to https://phabricator.wikimedia.org/P89685 and previous config saved to /var/cache/conftool/dbconfig/20260303-115057-marostegui.json
  • 11:48 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1352.eqiad.wmnet with reason: host reimage
  • 11:44 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1355.eqiad.wmnet with reason: host reimage
  • 11:43 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1354.eqiad.wmnet with reason: host reimage
  • 11:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P89684 and previous config saved to /var/cache/conftool/dbconfig/20260303-114341-marostegui.json
  • 11:43 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1353.eqiad.wmnet with reason: host reimage
  • 11:42 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1352.eqiad.wmnet with reason: host reimage
  • 11:40 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_eqsin and A:cp - 3.0 upgrade ()
  • 11:36 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_eqsin and A:cp - 3.0 upgrade ()
  • 11:31 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1355.eqiad.wmnet with OS trixie
  • 11:31 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1354.eqiad.wmnet with OS trixie
  • 11:30 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1353.eqiad.wmnet with OS trixie
  • 11:30 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1352.eqiad.wmnet with OS trixie
  • 11:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210 (T418465)', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20260303-112828-marostegui.json
  • 11:25 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1241 (T418465)', diff saved to https://phabricator.wikimedia.org/P89683 and previous config saved to /var/cache/conftool/dbconfig/20260303-112535-marostegui.json
  • 11:25 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1241.eqiad.wmnet with reason: Maintenance
  • 11:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238 (T418465)', diff saved to https://phabricator.wikimedia.org/P89682 and previous config saved to /var/cache/conftool/dbconfig/20260303-112511-marostegui.json
  • 11:21 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
  • 11:18 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
  • 11:18 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 11:17 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 11:17 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
  • 11:16 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1350-1351].eqiad.wmnet
  • 11:16 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1350-1351].eqiad.wmnet
  • 11:15 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
  • 11:15 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 11:15 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 11:15 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
  • 11:14 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
  • 11:14 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
  • 11:13 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
  • 11:13 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
  • 11:13 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1172.eqiad.wmnet
  • 11:13 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1171.eqiad.wmnet
  • 11:13 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
  • 11:13 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 11:12 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 11:11 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 11:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P89681 and previous config saved to /var/cache/conftool/dbconfig/20260303-111003-marostegui.json
  • 11:09 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 11:08 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 11:08 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 11:07 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 11:07 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 11:06 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 11:05 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2210 (T418465)', diff saved to https://phabricator.wikimedia.org/P89680 and previous config saved to /var/cache/conftool/dbconfig/20260303-110551-marostegui.json
  • 11:05 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2210.codfw.wmnet with reason: Maintenance
  • 11:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206 (T418465)', diff saved to https://phabricator.wikimedia.org/P89679 and previous config saved to /var/cache/conftool/dbconfig/20260303-110527-marostegui.json
  • 10:59 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1171.eqiad.wmnet
  • 10:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1170.eqiad.wmnet
  • 10:57 slyngshede@dns1004: END - running authdns-update
  • 10:55 slyngshede@dns1004: START - running authdns-update
  • 10:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P89678 and previous config saved to /var/cache/conftool/dbconfig/20260303-105455-marostegui.json
  • 10:54 hashar@deploy2002: Finished deploy [gerrit/gerrit@12177b1]: wm-checks-api: add tag for Selenium jobs (duration: 00m 13s)
  • 10:54 hashar@deploy2002: Started deploy [gerrit/gerrit@12177b1]: wm-checks-api: add tag for Selenium jobs
  • 10:51 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_eqsin and A:cp - 3.0 upgrade ()
  • 10:51 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_eqsin and A:cp - 3.0 upgrade ()
  • 10:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P89677 and previous config saved to /var/cache/conftool/dbconfig/20260303-105020-marostegui.json
  • 10:47 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
  • 10:46 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1170.eqiad.wmnet
  • 10:45 fabfur: start upgrading haproxy to 3.0 on A:cp-eqsin (T417253)
  • 10:43 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
  • 10:41 moritzm: installing Django security updates
  • 10:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238 (T418465)', diff saved to https://phabricator.wikimedia.org/P89676 and previous config saved to /var/cache/conftool/dbconfig/20260303-103947-marostegui.json
  • 10:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P89675 and previous config saved to /var/cache/conftool/dbconfig/20260303-103512-marostegui.json
  • 10:34 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
  • 10:33 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
  • 10:33 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 10:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 10:31 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
  • 10:25 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
  • 10:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206 (T418465)', diff saved to https://phabricator.wikimedia.org/P89674 and previous config saved to /var/cache/conftool/dbconfig/20260303-102004-marostegui.json
  • 10:18 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1238 (T418465)', diff saved to https://phabricator.wikimedia.org/P89673 and previous config saved to /var/cache/conftool/dbconfig/20260303-101800-marostegui.json
  • 10:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1238.eqiad.wmnet with reason: Maintenance
  • 10:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221 (T418465)', diff saved to https://phabricator.wikimedia.org/P89672 and previous config saved to /var/cache/conftool/dbconfig/20260303-101747-marostegui.json
  • 09:57 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
  • 09:56 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2206 (T418465)', diff saved to https://phabricator.wikimedia.org/P89670 and previous config saved to /var/cache/conftool/dbconfig/20260303-095655-marostegui.json
  • 09:56 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2206.codfw.wmnet with reason: Maintenance
  • 09:53 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
  • 09:51 moritzm: installing qemu security updates
  • 09:48 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revision-models' for release 'main' .
  • 09:48 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
  • 09:47 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P89669 and previous config saved to /var/cache/conftool/dbconfig/20260303-094732-marostegui.json
  • 09:47 elukey@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
  • 09:47 elukey@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
  • 09:45 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'readability' for release 'main' .
  • 09:45 elukey@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
  • 09:44 elukey@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
  • 09:44 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] 'sync' command on namespace 'ores-legacy' for release 'main' .
  • 09:44 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'logo-detection' for release 'main' .
  • 09:44 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1176.eqiad.wmnet with reason: host reimage
  • 09:43 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
  • 09:40 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
  • 09:38 fceratto@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1176.eqiad.wmnet with reason: host reimage
  • 09:35 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2199.codfw.wmnet with reason: Maintenance
  • 09:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T418465)', diff saved to https://phabricator.wikimedia.org/P89668 and previous config saved to /var/cache/conftool/dbconfig/20260303-093542-marostegui.json
  • 09:32 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221 (T418465)', diff saved to https://phabricator.wikimedia.org/P89667 and previous config saved to /var/cache/conftool/dbconfig/20260303-093224-marostegui.json
  • 09:32 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
  • 09:23 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo and A:cp - 3.0 upgrade ()
  • 09:23 fceratto@cumin1003: START - Cookbook sre.hosts.reimage for host db1176.eqiad.wmnet with OS trixie
  • 09:21 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
  • 09:20 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
  • 09:20 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo and A:cp - 3.0 upgrade ()
  • 09:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P89666 and previous config saved to /var/cache/conftool/dbconfig/20260303-092034-marostegui.json
  • 09:19 arnaudb@dns1004: END - running authdns-update
  • 09:18 arnaudb@dns1004: START - running authdns-update
  • 09:17 moritzm: installing libbpf updates from Bookworm point release
  • 09:08 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1221 (T418465)', diff saved to https://phabricator.wikimedia.org/P89665 and previous config saved to /var/cache/conftool/dbconfig/20260303-090818-marostegui.json
  • 09:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on 6 hosts with reason: Maintenance
  • 09:07 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1221.eqiad.wmnet with reason: Maintenance
  • 09:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T418465)', diff saved to https://phabricator.wikimedia.org/P89664 and previous config saved to /var/cache/conftool/dbconfig/20260303-090731-marostegui.json
  • 09:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P89663 and previous config saved to /var/cache/conftool/dbconfig/20260303-090526-marostegui.json
  • 08:54 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in codfw/ml-serve-codfw: maintenance
  • 08:53 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in codfw/ml-serve-codfw: maintenance
  • 08:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P89662 and previous config saved to /var/cache/conftool/dbconfig/20260303-085224-marostegui.json
  • 08:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T418465)', diff saved to https://phabricator.wikimedia.org/P89661 and previous config saved to /var/cache/conftool/dbconfig/20260303-085019-marostegui.json
  • 08:47 moritzm: powercycling lvs1013
  • 08:41 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo and A:cp - 3.0 upgrade ()
  • 08:41 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo and A:cp - 3.0 upgrade ()
  • 08:37 fabfur: start upgrading haproxy to 3.0 on A:cp-ulsfo (T417253)
  • 08:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P89660 and previous config saved to /var/cache/conftool/dbconfig/20260303-083716-marostegui.json
  • 08:32 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main'.
  • 08:32 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main'.
  • 08:31 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
  • 08:30 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
  • 08:28 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in codfw/ml-serve-codfw: maintenance
  • 08:27 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in codfw/ml-serve-codfw: maintenance
  • 08:24 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2172 (T418465)', diff saved to https://phabricator.wikimedia.org/P89659 and previous config saved to /var/cache/conftool/dbconfig/20260303-082424-marostegui.json
  • 08:24 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2172.codfw.wmnet with reason: Maintenance
  • 08:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T418465)', diff saved to https://phabricator.wikimedia.org/P89658 and previous config saved to /var/cache/conftool/dbconfig/20260303-082400-marostegui.json
  • 08:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T418465)', diff saved to https://phabricator.wikimedia.org/P89657 and previous config saved to /var/cache/conftool/dbconfig/20260303-082209-marostegui.json
  • 08:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P89656 and previous config saved to /var/cache/conftool/dbconfig/20260303-080853-marostegui.json
  • 08:07 moritzm: installing PAM security updates on Bookworm
  • 07:55 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1199 (T418465)', diff saved to https://phabricator.wikimedia.org/P89655 and previous config saved to /var/cache/conftool/dbconfig/20260303-075526-marostegui.json
  • 07:55 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1199.eqiad.wmnet with reason: Maintenance
  • 07:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T418465)', diff saved to https://phabricator.wikimedia.org/P89654 and previous config saved to /var/cache/conftool/dbconfig/20260303-075502-marostegui.json
  • 07:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P89653 and previous config saved to /var/cache/conftool/dbconfig/20260303-075345-marostegui.json
  • 07:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P89652 and previous config saved to /var/cache/conftool/dbconfig/20260303-073955-marostegui.json
  • 07:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T418465)', diff saved to https://phabricator.wikimedia.org/P89651 and previous config saved to /var/cache/conftool/dbconfig/20260303-073838-marostegui.json
  • 07:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P89650 and previous config saved to /var/cache/conftool/dbconfig/20260303-072447-marostegui.json
  • 07:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1051.eqiad.wmnet
  • 07:18 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1051.eqiad.wmnet
  • 07:10 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2155 (T418465)', diff saved to https://phabricator.wikimedia.org/P89649 and previous config saved to /var/cache/conftool/dbconfig/20260303-071054-marostegui.json
  • 07:10 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2155.codfw.wmnet with reason: Maintenance
  • 07:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T418465)', diff saved to https://phabricator.wikimedia.org/P89648 and previous config saved to /var/cache/conftool/dbconfig/20260303-071029-marostegui.json
  • 07:09 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T418465)', diff saved to https://phabricator.wikimedia.org/P89647 and previous config saved to /var/cache/conftool/dbconfig/20260303-070940-marostegui.json
  • 06:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P89646 and previous config saved to /var/cache/conftool/dbconfig/20260303-065523-marostegui.json
  • 06:44 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1190 (T418465)', diff saved to https://phabricator.wikimedia.org/P89645 and previous config saved to /var/cache/conftool/dbconfig/20260303-064405-marostegui.json
  • 06:43 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1190.eqiad.wmnet with reason: Maintenance
  • 06:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P89644 and previous config saved to /var/cache/conftool/dbconfig/20260303-064015-marostegui.json
  • 06:33 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db2240 gradually with 4 steps - repool after schema change
  • 06:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T418465)', diff saved to https://phabricator.wikimedia.org/P89642 and previous config saved to /var/cache/conftool/dbconfig/20260303-062507-marostegui.json
  • 05:58 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2147 (T418465)', diff saved to https://phabricator.wikimedia.org/P89639 and previous config saved to /var/cache/conftool/dbconfig/20260303-055834-marostegui.json
  • 05:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2147.codfw.wmnet with reason: Maintenance
  • 05:57 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 05:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2212.codfw.wmnet with reason: Maintenance
  • 05:48 marostegui@cumin1003: START - Cookbook sre.mysql.pool db2240 gradually with 4 steps - repool after schema change
  • 05:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1163.eqiad.wmnet with reason: Maintenance
  • 05:01 mwpresync@deploy2002: Pruned MediaWiki: 1.46.0-wmf.15 (duration: 01m 10s)
  • 04:43 mwpresync@deploy2002: Finished scap sync-world: testwikis to 1.46.0-wmf.18 refs T413809 (duration: 39m 43s)
  • 04:03 mwpresync@deploy2002: Started scap sync-world: testwikis to 1.46.0-wmf.18 refs T413809
  • 03:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 03:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251 (T418465)', diff saved to https://phabricator.wikimedia.org/P89637 and previous config saved to /var/cache/conftool/dbconfig/20260303-035746-marostegui.json
  • 03:42 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251', diff saved to https://phabricator.wikimedia.org/P89636 and previous config saved to /var/cache/conftool/dbconfig/20260303-034239-marostegui.json
  • 03:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251', diff saved to https://phabricator.wikimedia.org/P89635 and previous config saved to /var/cache/conftool/dbconfig/20260303-032731-marostegui.json
  • 03:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251 (T418465)', diff saved to https://phabricator.wikimedia.org/P89634 and previous config saved to /var/cache/conftool/dbconfig/20260303-031224-marostegui.json
  • 03:02 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1251 (T418465)', diff saved to https://phabricator.wikimedia.org/P89633 and previous config saved to /var/cache/conftool/dbconfig/20260303-030217-marostegui.json
  • 03:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1251.eqiad.wmnet with reason: Maintenance
  • 02:16 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1240.eqiad.wmnet with reason: Maintenance
  • 02:08 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 00s)
  • 02:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1239.eqiad.wmnet with reason: Maintenance
  • 02:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235 (T418465)', diff saved to https://phabricator.wikimedia.org/P89632 and previous config saved to /var/cache/conftool/dbconfig/20260303-020817-marostegui.json
  • 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
  • 01:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P89631 and previous config saved to /var/cache/conftool/dbconfig/20260303-015309-marostegui.json
  • 01:42 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mwlog2003.codfw.wmnet with OS trixie
  • 01:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P89630 and previous config saved to /var/cache/conftool/dbconfig/20260303-013802-marostegui.json
  • 01:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216 (T418465)', diff saved to https://phabricator.wikimedia.org/P89629 and previous config saved to /var/cache/conftool/dbconfig/20260303-013719-marostegui.json
  • 01:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235 (T418465)', diff saved to https://phabricator.wikimedia.org/P89628 and previous config saved to /var/cache/conftool/dbconfig/20260303-012254-marostegui.json
  • 01:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P89627 and previous config saved to /var/cache/conftool/dbconfig/20260303-012211-marostegui.json
  • 01:19 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mwlog2003.codfw.wmnet with reason: host reimage
  • 01:11 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mwlog2003.codfw.wmnet with reason: host reimage
  • 01:11 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1235 (T418465)', diff saved to https://phabricator.wikimedia.org/P89626 and previous config saved to /var/cache/conftool/dbconfig/20260303-011151-marostegui.json
  • 01:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1235.eqiad.wmnet with reason: Maintenance
  • 01:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234 (T418465)', diff saved to https://phabricator.wikimedia.org/P89625 and previous config saved to /var/cache/conftool/dbconfig/20260303-011128-marostegui.json
  • 01:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P89624 and previous config saved to /var/cache/conftool/dbconfig/20260303-010703-marostegui.json
  • 00:59 zabe@deploy2002: Finished scap sync-world: Backport for Revert "ImageListPager: Properly support file schema migration read new" (duration: 08m 12s)
  • 00:56 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P89623 and previous config saved to /var/cache/conftool/dbconfig/20260303-005620-marostegui.json
  • 00:56 zabe@deploy2002: zabe: Continuing with sync
  • 00:54 herron@cumin1003: START - Cookbook sre.hosts.reimage for host mwlog2003.codfw.wmnet with OS trixie
  • 00:53 zabe@deploy2002: zabe: Backport for Revert "ImageListPager: Properly support file schema migration read new" synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 00:53 herron@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mwlog2003.codfw.wmnet with OS trixie
  • 00:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216 (T418465)', diff saved to https://phabricator.wikimedia.org/P89622 and previous config saved to /var/cache/conftool/dbconfig/20260303-005156-marostegui.json
  • 00:51 zabe@deploy2002: Started scap sync-world: Backport for Revert "ImageListPager: Properly support file schema migration read new"
  • 00:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P89621 and previous config saved to /var/cache/conftool/dbconfig/20260303-004112-marostegui.json
  • 00:40 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2216 (T418465)', diff saved to https://phabricator.wikimedia.org/P89620 and previous config saved to /var/cache/conftool/dbconfig/20260303-004056-marostegui.json
  • 00:40 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2216.codfw.wmnet with reason: Maintenance
  • 00:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2203 (T418465)', diff saved to https://phabricator.wikimedia.org/P89619 and previous config saved to /var/cache/conftool/dbconfig/20260303-004033-marostegui.json
  • 00:31 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mwlog1003.eqiad.wmnet with OS trixie
  • 00:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234 (T418465)', diff saved to https://phabricator.wikimedia.org/P89618 and previous config saved to /var/cache/conftool/dbconfig/20260303-002604-marostegui.json
  • 00:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2203', diff saved to https://phabricator.wikimedia.org/P89617 and previous config saved to /var/cache/conftool/dbconfig/20260303-002525-marostegui.json
  • 00:20 zabe@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
  • 00:18 zabe@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
  • 00:18 zabe@deploy2002: Finished scap sync-world: T418327 (duration: 05m 01s)
  • 00:15 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1234 (T418465)', diff saved to https://phabricator.wikimedia.org/P89616 and previous config saved to /var/cache/conftool/dbconfig/20260303-001504-marostegui.json
  • 00:14 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1234.eqiad.wmnet with reason: Maintenance
  • 00:14 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232 (T418465)', diff saved to https://phabricator.wikimedia.org/P89615 and previous config saved to /var/cache/conftool/dbconfig/20260303-001440-marostegui.json
  • 00:13 zabe@deploy2002: Started scap sync-world: T418327
  • 00:11 zabe@deploy2002: zabe: Continuing with sync
  • 00:10 zabe@deploy2002: zabe: Backport for ImageListPager: Properly support file schema migration read new (T418327) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 00:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2203', diff saved to https://phabricator.wikimedia.org/P89614 and previous config saved to /var/cache/conftool/dbconfig/20260303-001018-marostegui.json
  • 00:08 zabe@deploy2002: Started scap sync-world: Backport for ImageListPager: Properly support file schema migration read new (T418327)

2026-03-02

  • 23:59 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P89613 and previous config saved to /var/cache/conftool/dbconfig/20260302-235933-marostegui.json
  • 23:58 zabe@deploy2002: Finished scap sync-world: Backport for Stop writing to il_to on testwiki (T415787) (duration: 06m 02s)
  • 23:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2203 (T418465)', diff saved to https://phabricator.wikimedia.org/P89612 and previous config saved to /var/cache/conftool/dbconfig/20260302-235511-marostegui.json
  • 23:54 zabe@deploy2002: zabe: Continuing with sync
  • 23:53 zabe@deploy2002: zabe: Backport for Stop writing to il_to on testwiki (T415787) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 23:52 zabe@deploy2002: Started scap sync-world: Backport for Stop writing to il_to on testwiki (T415787)
  • 23:51 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on cp2058.codfw.wmnet with reason: dcops troubleshooting for T418527
  • 23:50 zabe@deploy2002: Finished scap sync-world: Backport for multiversion: Stop setting MW_USE_CONFIG_SCHEMA (T304460) (duration: 07m 10s)
  • 23:47 zabe@deploy2002: zabe: Continuing with sync
  • 23:45 zabe@deploy2002: zabe: Backport for multiversion: Stop setting MW_USE_CONFIG_SCHEMA (T304460) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 23:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P89611 and previous config saved to /var/cache/conftool/dbconfig/20260302-234425-marostegui.json
  • 23:44 herron@cumin1003: START - Cookbook sre.hosts.reimage for host mwlog2003.codfw.wmnet with OS trixie
  • 23:43 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2203 (T418465)', diff saved to https://phabricator.wikimedia.org/P89610 and previous config saved to /var/cache/conftool/dbconfig/20260302-234350-marostegui.json
  • 23:43 zabe@deploy2002: Started scap sync-world: Backport for multiversion: Stop setting MW_USE_CONFIG_SCHEMA (T304460)
  • 23:43 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2203.codfw.wmnet with reason: Maintenance
  • 23:35 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2202.codfw.wmnet with reason: Maintenance
  • 23:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188 (T418465)', diff saved to https://phabricator.wikimedia.org/P89609 and previous config saved to /var/cache/conftool/dbconfig/20260302-233517-marostegui.json
  • 23:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232 (T418465)', diff saved to https://phabricator.wikimedia.org/P89608 and previous config saved to /var/cache/conftool/dbconfig/20260302-232918-marostegui.json
  • 23:25 dwisehaupt@dns1006: END - running authdns-update
  • 23:24 dwisehaupt@dns1006: START - running authdns-update
  • 23:23 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mwlog1003.eqiad.wmnet with reason: host reimage
  • 23:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P89607 and previous config saved to /var/cache/conftool/dbconfig/20260302-232009-marostegui.json
  • 23:18 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mwlog1003.eqiad.wmnet with reason: host reimage
  • 23:17 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1232 (T418465)', diff saved to https://phabricator.wikimedia.org/P89606 and previous config saved to /var/cache/conftool/dbconfig/20260302-231723-marostegui.json
  • 23:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1232.eqiad.wmnet with reason: Maintenance
  • 23:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219 (T418465)', diff saved to https://phabricator.wikimedia.org/P89605 and previous config saved to /var/cache/conftool/dbconfig/20260302-231658-marostegui.json
  • 23:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P89604 and previous config saved to /var/cache/conftool/dbconfig/20260302-230502-marostegui.json
  • 23:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P89603 and previous config saved to /var/cache/conftool/dbconfig/20260302-230151-marostegui.json
  • 22:57 herron@cumin1003: START - Cookbook sre.hosts.reimage for host mwlog1003.eqiad.wmnet with OS trixie
  • 22:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188 (T418465)', diff saved to https://phabricator.wikimedia.org/P89602 and previous config saved to /var/cache/conftool/dbconfig/20260302-224954-marostegui.json
  • 22:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P89601 and previous config saved to /var/cache/conftool/dbconfig/20260302-224643-marostegui.json
  • 22:36 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2188 (T418465)', diff saved to https://phabricator.wikimedia.org/P89600 and previous config saved to /var/cache/conftool/dbconfig/20260302-223612-marostegui.json
  • 22:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2188.codfw.wmnet with reason: Maintenance
  • 22:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T418465)', diff saved to https://phabricator.wikimedia.org/P89599 and previous config saved to /var/cache/conftool/dbconfig/20260302-223548-marostegui.json
  • 22:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219 (T418465)', diff saved to https://phabricator.wikimedia.org/P89598 and previous config saved to /var/cache/conftool/dbconfig/20260302-223135-marostegui.json
  • 22:21 maryum: Deployed security fix for T418179
  • 22:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P89597 and previous config saved to /var/cache/conftool/dbconfig/20260302-222041-marostegui.json
  • 22:19 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1219 (T418465)', diff saved to https://phabricator.wikimedia.org/P89596 and previous config saved to /var/cache/conftool/dbconfig/20260302-221938-marostegui.json
  • 22:19 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1219.eqiad.wmnet with reason: Maintenance
  • 22:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T418465)', diff saved to https://phabricator.wikimedia.org/P89595 and previous config saved to /var/cache/conftool/dbconfig/20260302-221925-marostegui.json
  • 22:10 aaron@deploy2002: Finished scap sync-world: Backport for Add growthexperiments.v0 to $wgRestSandboxSpecs (T414470) (duration: 06m 39s)
  • 22:06 aaron@deploy2002: aaron: Continuing with sync
  • 22:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P89594 and previous config saved to /var/cache/conftool/dbconfig/20260302-220533-marostegui.json
  • 22:05 aaron@deploy2002: aaron: Backport for Add growthexperiments.v0 to $wgRestSandboxSpecs (T414470) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 22:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P89593 and previous config saved to /var/cache/conftool/dbconfig/20260302-220418-marostegui.json
  • 22:03 aaron@deploy2002: Started scap sync-world: Backport for Add growthexperiments.v0 to $wgRestSandboxSpecs (T414470)
  • 22:03 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-backup2003.codfw.wmnet with OS trixie
  • 22:03 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-backup2004.codfw.wmnet with OS trixie
  • 22:03 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 22:03 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 22:02 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 22:01 catrope@deploy2002: Finished scap sync-world: Backport for ApiCSPReport: Use structured logging for CSP reports (duration: 08m 19s)
  • 21:57 catrope@deploy2002: catrope: Continuing with sync
  • 21:56 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 21:55 catrope@deploy2002: catrope: Backport for ApiCSPReport: Use structured logging for CSP reports synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 21:53 catrope@deploy2002: Started scap sync-world: Backport for ApiCSPReport: Use structured logging for CSP reports
  • 21:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T418465)', diff saved to https://phabricator.wikimedia.org/P89592 and previous config saved to /var/cache/conftool/dbconfig/20260302-215025-marostegui.json
  • 21:50 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cp2043.codfw.wmnet with reason: These are test instances, failing should not notif
  • 21:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P89591 and previous config saved to /var/cache/conftool/dbconfig/20260302-214910-marostegui.json
  • 21:48 inflatador: bking@desktop restarting wdqs codfw to clear ProbeDown alerts
  • 21:43 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cp2043.codfw.wmnet
  • 21:42 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-backup2004.codfw.wmnet with reason: host reimage
  • 21:39 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2176 (T418465)', diff saved to https://phabricator.wikimedia.org/P89590 and previous config saved to /var/cache/conftool/dbconfig/20260302-213957-marostegui.json
  • 21:39 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2176.codfw.wmnet with reason: Maintenance
  • 21:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T418465)', diff saved to https://phabricator.wikimedia.org/P89589 and previous config saved to /var/cache/conftool/dbconfig/20260302-213934-marostegui.json
  • 21:38 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-backup2003.codfw.wmnet with reason: host reimage
  • 21:36 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:cassandra-dev: Testing removal of OpenJDK 8 support - eevans@cumin1003
  • 21:34 catrope@deploy2002: Finished scap sync-world: Backport for Add Comments namespace for shnwikinews (T414403) (duration: 07m 07s)
  • 21:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T418465)', diff saved to https://phabricator.wikimedia.org/P89588 and previous config saved to /var/cache/conftool/dbconfig/20260302-213402-marostegui.json
  • 21:33 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-backup2004.codfw.wmnet with reason: host reimage
  • 21:33 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-backup2003.codfw.wmnet with reason: host reimage
  • 21:30 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host cp2043.codfw.wmnet
  • 21:30 catrope@deploy2002: shivaanshsingh, catrope: Continuing with sync
  • 21:29 catrope@deploy2002: shivaanshsingh, catrope: Backport for Add Comments namespace for shnwikinews (T414403) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 21:27 catrope@deploy2002: Started scap sync-world: Backport for Add Comments namespace for shnwikinews (T414403)
  • 21:24 kemayo@deploy2002: Finished scap sync-world: Backport for Suggestion Mode: add values for suggestion feedback properties (T401739), Stop PasteCheck A/B test (T417429) (duration: 10m 55s)
  • 21:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P89587 and previous config saved to /var/cache/conftool/dbconfig/20260302-212426-marostegui.json
  • 21:23 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1218 (T418465)', diff saved to https://phabricator.wikimedia.org/P89586 and previous config saved to /var/cache/conftool/dbconfig/20260302-212345-marostegui.json
  • 21:23 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1218.eqiad.wmnet with reason: Maintenance
  • 21:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206 (T418465)', diff saved to https://phabricator.wikimedia.org/P89585 and previous config saved to /var/cache/conftool/dbconfig/20260302-212321-marostegui.json
  • 21:20 kemayo@deploy2002: esanders, kemayo, caro: Continuing with sync
  • 21:17 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host ms-backup2004.codfw.wmnet with OS trixie
  • 21:16 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host ms-backup2003.codfw.wmnet with OS trixie
  • 21:16 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:cassandra-dev: Testing removal of OpenJDK 8 support - eevans@cumin1003
  • 21:16 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['ms-backup2003']
  • 21:15 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ms-backup2003']
  • 21:15 kemayo@deploy2002: esanders, kemayo, caro: Backport for Suggestion Mode: add values for suggestion feedback properties (T401739), Stop PasteCheck A/B test (T417429) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 21:14 inflatador: bking@apt1002 reprepro --component thirdparty/opensearch3 update trixie-wikimedia T418388
  • 21:13 kemayo@deploy2002: Started scap sync-world: Backport for Suggestion Mode: add values for suggestion feedback properties (T401739), Stop PasteCheck A/B test (T417429)
  • 21:12 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-backup2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 21:11 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-backup2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 21:10 dani@deploy2002: Finished scap sync-world: Backport for Undeploy Comparative Reader Research survey on eswiki (T417834), Undeploy Comparative Reader Research survey on enwiki (T417829) (duration: 06m 52s)
  • 21:09 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P89584 and previous config saved to /var/cache/conftool/dbconfig/20260302-210919-marostegui.json
  • 21:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P89583 and previous config saved to /var/cache/conftool/dbconfig/20260302-210813-marostegui.json
  • 21:06 dani@deploy2002: dani: Continuing with sync
  • 21:05 dani@deploy2002: dani: Backport for Undeploy Comparative Reader Research survey on eswiki (T417834), Undeploy Comparative Reader Research survey on enwiki (T417829) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 21:03 dani@deploy2002: Started scap sync-world: Backport for Undeploy Comparative Reader Research survey on eswiki (T417834), Undeploy Comparative Reader Research survey on enwiki (T417829)
  • 20:58 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host ms-backup2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 20:58 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host ms-backup2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 20:56 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-backup2004
  • 20:56 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-backup2004
  • 20:56 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-backup2003
  • 20:56 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-backup2003
  • 20:54 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 20:54 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding ms-backup2003 to codfw - jhancock@cumin2002"
  • 20:54 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding ms-backup2003 to codfw - jhancock@cumin2002"
  • 20:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T418465)', diff saved to https://phabricator.wikimedia.org/P89582 and previous config saved to /var/cache/conftool/dbconfig/20260302-205411-marostegui.json
  • 20:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P89581 and previous config saved to /var/cache/conftool/dbconfig/20260302-205307-marostegui.json
  • 20:50 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 20:42 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 20:41 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2174 (T418465)', diff saved to https://phabricator.wikimedia.org/P89580 and previous config saved to /var/cache/conftool/dbconfig/20260302-204136-marostegui.json
  • 20:41 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2174.codfw.wmnet with reason: Maintenance
  • 20:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173 (T418465)', diff saved to https://phabricator.wikimedia.org/P89579 and previous config saved to /var/cache/conftool/dbconfig/20260302-204112-marostegui.json
  • 20:41 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 20:39 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 20:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206 (T418465)', diff saved to https://phabricator.wikimedia.org/P89578 and previous config saved to /var/cache/conftool/dbconfig/20260302-203759-marostegui.json
  • 20:37 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 20:36 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 20:33 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 20:31 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 20:30 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 20:27 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1206 (T418465)', diff saved to https://phabricator.wikimedia.org/P89577 and previous config saved to /var/cache/conftool/dbconfig/20260302-202740-marostegui.json
  • 20:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1206.eqiad.wmnet with reason: Maintenance
  • 20:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T418465)', diff saved to https://phabricator.wikimedia.org/P89576 and previous config saved to /var/cache/conftool/dbconfig/20260302-202716-marostegui.json
  • 20:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P89575 and previous config saved to /var/cache/conftool/dbconfig/20260302-202604-marostegui.json
  • 20:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P89574 and previous config saved to /var/cache/conftool/dbconfig/20260302-201209-marostegui.json
  • 20:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P89573 and previous config saved to /var/cache/conftool/dbconfig/20260302-201057-marostegui.json
  • 20:01 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcephosd2008-dev.codfw.wmnet with OS trixie
  • 20:00 andrew@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudcephosd2008-dev.codfw.wmnet with OS trixie
  • 19:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P89572 and previous config saved to /var/cache/conftool/dbconfig/20260302-195702-marostegui.json
  • 19:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173 (T418465)', diff saved to https://phabricator.wikimedia.org/P89571 and previous config saved to /var/cache/conftool/dbconfig/20260302-195549-marostegui.json
  • 19:44 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2173 (T418465)', diff saved to https://phabricator.wikimedia.org/P89570 and previous config saved to /var/cache/conftool/dbconfig/20260302-194435-marostegui.json
  • 19:44 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2173.codfw.wmnet with reason: Maintenance
  • 19:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170 (T418465)', diff saved to https://phabricator.wikimedia.org/P89569 and previous config saved to /var/cache/conftool/dbconfig/20260302-194411-marostegui.json
  • 19:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T418465)', diff saved to https://phabricator.wikimedia.org/P89568 and previous config saved to /var/cache/conftool/dbconfig/20260302-194155-marostegui.json
  • 19:31 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1196 (T418465)', diff saved to https://phabricator.wikimedia.org/P89566 and previous config saved to /var/cache/conftool/dbconfig/20260302-193119-marostegui.json
  • 19:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1013,1017].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 19:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1196.eqiad.wmnet with reason: Maintenance
  • 19:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195 (T418465)', diff saved to https://phabricator.wikimedia.org/P89565 and previous config saved to /var/cache/conftool/dbconfig/20260302-193046-marostegui.json
  • 19:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P89564 and previous config saved to /var/cache/conftool/dbconfig/20260302-192903-marostegui.json
  • 19:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P89563 and previous config saved to /var/cache/conftool/dbconfig/20260302-191539-marostegui.json
  • 19:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P89562 and previous config saved to /var/cache/conftool/dbconfig/20260302-191355-marostegui.json
  • 19:12 dzahn@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 19:12 dzahn@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 19:04 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2095.codfw.wmnet with OS bullseye
  • 19:00 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P89561 and previous config saved to /var/cache/conftool/dbconfig/20260302-190032-marostegui.json
  • 18:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170 (T418465)', diff saved to https://phabricator.wikimedia.org/P89560 and previous config saved to /var/cache/conftool/dbconfig/20260302-185848-marostegui.json
  • 18:54 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcephosd2008-dev.codfw.wmnet with OS trixie
  • 18:53 andrew@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudcephosd2008-dev.codfw.wmnet with OS trixie
  • 18:48 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2170 (T418465)', diff saved to https://phabricator.wikimedia.org/P89559 and previous config saved to /var/cache/conftool/dbconfig/20260302-184832-marostegui.json
  • 18:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2170.codfw.wmnet with reason: Maintenance
  • 18:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T418465)', diff saved to https://phabricator.wikimedia.org/P89558 and previous config saved to /var/cache/conftool/dbconfig/20260302-184808-marostegui.json
  • 18:45 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195 (T418465)', diff saved to https://phabricator.wikimedia.org/P89557 and previous config saved to /var/cache/conftool/dbconfig/20260302-184524-marostegui.json
  • 18:34 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1195 (T418465)', diff saved to https://phabricator.wikimedia.org/P89556 and previous config saved to /var/cache/conftool/dbconfig/20260302-183449-marostegui.json
  • 18:34 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1195.eqiad.wmnet with reason: Maintenance
  • 18:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T418465)', diff saved to https://phabricator.wikimedia.org/P89555 and previous config saved to /var/cache/conftool/dbconfig/20260302-183425-marostegui.json
  • 18:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P89554 and previous config saved to /var/cache/conftool/dbconfig/20260302-183300-marostegui.json
  • 18:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P89553 and previous config saved to /var/cache/conftool/dbconfig/20260302-181918-marostegui.json
  • 18:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P89552 and previous config saved to /var/cache/conftool/dbconfig/20260302-181753-marostegui.json
  • 18:16 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcephosd2008-dev.codfw.wmnet with OS trixie
  • 18:04 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 18:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P89551 and previous config saved to /var/cache/conftool/dbconfig/20260302-180411-marostegui.json
  • 18:03 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 18:02 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T418465)', diff saved to https://phabricator.wikimedia.org/P89550 and previous config saved to /var/cache/conftool/dbconfig/20260302-180245-marostegui.json
  • 18:01 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 17:54 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 17:53 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host contint2003
  • 17:53 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host contint2003
  • 17:52 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:52 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding contint2003 to codfw - jhancock@cumin2002"
  • 17:52 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding contint2003 to codfw - jhancock@cumin2002"
  • 17:49 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2153 (T418465)', diff saved to https://phabricator.wikimedia.org/P89549 and previous config saved to /var/cache/conftool/dbconfig/20260302-174917-marostegui.json
  • 17:49 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 17:49 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2153.codfw.wmnet with reason: Maintenance
  • 17:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T418465)', diff saved to https://phabricator.wikimedia.org/P89548 and previous config saved to /var/cache/conftool/dbconfig/20260302-174903-marostegui.json
  • 17:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T418465)', diff saved to https://phabricator.wikimedia.org/P89547 and previous config saved to /var/cache/conftool/dbconfig/20260302-174854-marostegui.json
  • 17:46 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 17:44 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 17:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2095.codfw.wmnet with OS bullseye
  • 17:43 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host contint2003
  • 17:43 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host contint2003
  • 17:43 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:43 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding contint2003 to codfw - jhancock@cumin2002"
  • 17:42 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding contint2003 to codfw - jhancock@cumin2002"
  • 17:39 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 17:38 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1186 (T418465)', diff saved to https://phabricator.wikimedia.org/P89546 and previous config saved to /var/cache/conftool/dbconfig/20260302-173827-marostegui.json
  • 17:38 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1186.eqiad.wmnet with reason: Maintenance
  • 17:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T418465)', diff saved to https://phabricator.wikimedia.org/P89545 and previous config saved to /var/cache/conftool/dbconfig/20260302-173803-marostegui.json
  • 17:37 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
  • 17:36 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
  • 17:34 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
  • 17:33 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
  • 17:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P89544 and previous config saved to /var/cache/conftool/dbconfig/20260302-173347-marostegui.json
  • 17:32 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.update-replication (exit_code=99)
  • 17:32 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
  • 17:24 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
  • 17:23 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
  • 17:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P89543 and previous config saved to /var/cache/conftool/dbconfig/20260302-172256-marostegui.json
  • 17:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P89542 and previous config saved to /var/cache/conftool/dbconfig/20260302-171839-marostegui.json
  • 17:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P89541 and previous config saved to /var/cache/conftool/dbconfig/20260302-170748-marostegui.json
  • 17:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T418465)', diff saved to https://phabricator.wikimedia.org/P89540 and previous config saved to /var/cache/conftool/dbconfig/20260302-170331-marostegui.json
  • 16:52 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2230.codfw.wmnet with OS trixie
  • 16:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T418465)', diff saved to https://phabricator.wikimedia.org/P89539 and previous config saved to /var/cache/conftool/dbconfig/20260302-165240-marostegui.json
  • 16:51 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2146 (T418465)', diff saved to https://phabricator.wikimedia.org/P89538 and previous config saved to /var/cache/conftool/dbconfig/20260302-165153-marostegui.json
  • 16:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2146.codfw.wmnet with reason: Maintenance
  • 16:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T418465)', diff saved to https://phabricator.wikimedia.org/P89537 and previous config saved to /var/cache/conftool/dbconfig/20260302-165129-marostegui.json
  • 16:41 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1184 (T418465)', diff saved to https://phabricator.wikimedia.org/P89536 and previous config saved to /var/cache/conftool/dbconfig/20260302-164141-marostegui.json
  • 16:41 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1184.eqiad.wmnet with reason: Maintenance
  • 16:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T418465)', diff saved to https://phabricator.wikimedia.org/P89535 and previous config saved to /var/cache/conftool/dbconfig/20260302-164118-marostegui.json
  • 16:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P89534 and previous config saved to /var/cache/conftool/dbconfig/20260302-163622-marostegui.json
  • 16:29 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2230.codfw.wmnet with reason: host reimage
  • 16:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P89533 and previous config saved to /var/cache/conftool/dbconfig/20260302-162610-marostegui.json
  • 16:21 fceratto@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2230.codfw.wmnet with reason: host reimage
  • 16:21 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P89532 and previous config saved to /var/cache/conftool/dbconfig/20260302-162115-marostegui.json
  • 16:19 moritzm: installing PAM security updates on Bookworm
  • 16:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P89531 and previous config saved to /var/cache/conftool/dbconfig/20260302-161102-marostegui.json
  • 16:06 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T418465)', diff saved to https://phabricator.wikimedia.org/P89530 and previous config saved to /var/cache/conftool/dbconfig/20260302-160607-marostegui.json
  • 16:05 fceratto@cumin1003: START - Cookbook sre.hosts.reimage for host db2230.codfw.wmnet with OS trixie
  • 15:56 moritzm: installing glibc bugfix updates from trixie point release
  • 15:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T418465)', diff saved to https://phabricator.wikimedia.org/P89529 and previous config saved to /var/cache/conftool/dbconfig/20260302-155555-marostegui.json
  • 15:55 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2145 (T418465)', diff saved to https://phabricator.wikimedia.org/P89528 and previous config saved to /var/cache/conftool/dbconfig/20260302-155527-marostegui.json
  • 15:55 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2145.codfw.wmnet with reason: Maintenance
  • 15:46 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2141.codfw.wmnet with reason: Maintenance
  • 15:45 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1169.eqiad.wmnet
  • 15:45 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1169 (T418465)', diff saved to https://phabricator.wikimedia.org/P89527 and previous config saved to /var/cache/conftool/dbconfig/20260302-154520-marostegui.json
  • 15:45 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 15:38 kevinbazira@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
  • 15:34 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1169.eqiad.wmnet
  • 15:33 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1167.eqiad.wmnet
  • 15:32 kevinbazira@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 15:32 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2165.codfw.wmnet with reason: Maintenance
  • 15:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1209.eqiad.wmnet with reason: Maintenance
  • 15:31 marostegui@cumin1003: dbctl commit (dc=all): 'Restore db1226 full weight after schema change', diff saved to https://phabricator.wikimedia.org/P89526 and previous config saved to /var/cache/conftool/dbconfig/20260302-153100-marostegui.json
  • 15:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P89525 and previous config saved to /var/cache/conftool/dbconfig/20260302-152334-marostegui.json
  • 15:22 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1167.eqiad.wmnet
  • 15:22 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1166.eqiad.wmnet
  • 15:18 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2198.codfw.wmnet with reason: Maintenance
  • 15:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2195 (T418465)', diff saved to https://phabricator.wikimedia.org/P89524 and previous config saved to /var/cache/conftool/dbconfig/20260302-151838-marostegui.json
  • 15:10 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1166.eqiad.wmnet
  • 15:10 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1165.eqiad.wmnet
  • 15:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P89523 and previous config saved to /var/cache/conftool/dbconfig/20260302-150826-marostegui.json
  • 15:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2195', diff saved to https://phabricator.wikimedia.org/P89522 and previous config saved to /var/cache/conftool/dbconfig/20260302-150330-marostegui.json
  • 15:00 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1097.eqiad.wmnet with OS bullseye
  • 15:00 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
  • 14:58 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1165.eqiad.wmnet
  • 14:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1164.eqiad.wmnet
  • 14:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226 (T418465)', diff saved to https://phabricator.wikimedia.org/P89520 and previous config saved to /var/cache/conftool/dbconfig/20260302-145318-marostegui.json
  • 14:48 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1164.eqiad.wmnet
  • 14:48 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1163.eqiad.wmnet
  • 14:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2195', diff saved to https://phabricator.wikimedia.org/P89519 and previous config saved to /var/cache/conftool/dbconfig/20260302-144823-marostegui.json
  • 14:41 Lucas_WMDE: UTC afternoon backport+config window done
  • 14:40 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for IPInfo: Set log level to "info" (T374718) (duration: 08m 01s)
  • 14:37 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1163.eqiad.wmnet
  • 14:36 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1162.eqiad.wmnet
  • 14:36 lucaswerkmeister-wmde@deploy2002: kharlan, lucaswerkmeister-wmde: Continuing with sync
  • 14:36 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1226 (T418465)', diff saved to https://phabricator.wikimedia.org/P89517 and previous config saved to /var/cache/conftool/dbconfig/20260302-143608-marostegui.json
  • 14:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1226.eqiad.wmnet with reason: Maintenance
  • 14:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214 (T418465)', diff saved to https://phabricator.wikimedia.org/P89516 and previous config saved to /var/cache/conftool/dbconfig/20260302-143544-marostegui.json
  • 14:34 lucaswerkmeister-wmde@deploy2002: kharlan, lucaswerkmeister-wmde: Backport for IPInfo: Set log level to "info" (T374718) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2195 (T418465)', diff saved to https://phabricator.wikimedia.org/P89515 and previous config saved to /var/cache/conftool/dbconfig/20260302-143315-marostegui.json
  • 14:32 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for IPInfo: Set log level to "info" (T374718)
  • 14:31 ladsgroup@cumin1003: END (FAIL) - Cookbook sre.wikireplicas.update-views (exit_code=99)
  • 14:30 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Add configurations for graphql usage survey and its pipeline tests (T414476) (duration: 09m 44s)
  • 14:27 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
  • 14:26 lucaswerkmeister-wmde@deploy2002: itamar, lucaswerkmeister-wmde: Continuing with sync
  • 14:26 ladsgroup@cumin1003: START - Cookbook sre.wikireplicas.update-views
  • 14:25 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1162.eqiad.wmnet
  • 14:25 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1161.eqiad.wmnet
  • 14:23 ladsgroup@cumin1003: END (FAIL) - Cookbook sre.wikireplicas.update-views (exit_code=99)
  • 14:22 lucaswerkmeister-wmde@deploy2002: itamar, lucaswerkmeister-wmde: Backport for Add configurations for graphql usage survey and its pipeline tests (T414476) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:20 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Add configurations for graphql usage survey and its pipeline tests (T414476)
  • 14:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P89514 and previous config saved to /var/cache/conftool/dbconfig/20260302-142037-marostegui.json
  • 14:19 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ms-fe1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 14:18 lucaswerkmeister-wmde@deploy2002: mwscript-k8s job started: namespaceDupes lawiki --fix # T418706
  • 14:18 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2195 (T418465)', diff saved to https://phabricator.wikimedia.org/P89513 and previous config saved to /var/cache/conftool/dbconfig/20260302-141834-marostegui.json
  • 14:18 elukey@puppetserver1001: conftool action : set/pooled=no; selector: name=ms-fe1013.eqiad.wmnet
  • 14:18 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2195.codfw.wmnet with reason: Maintenance
  • 14:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host people2004.codfw.wmnet
  • 14:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T418465)', diff saved to https://phabricator.wikimedia.org/P89512 and previous config saved to /var/cache/conftool/dbconfig/20260302-141810-marostegui.json
  • 14:17 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for lawiki: add Adumbratio (draft) namespace (T418706) (duration: 07m 27s)
  • 14:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host people2004.codfw.wmnet
  • 14:13 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, anzx: Continuing with sync
  • 14:13 ladsgroup@cumin1003: START - Cookbook sre.wikireplicas.update-views
  • 14:13 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1161.eqiad.wmnet
  • 14:13 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1160.eqiad.wmnet
  • 14:13 moritzm: installing libcap2 updates from Trixie point release
  • 14:12 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, anzx: Backport for lawiki: add Adumbratio (draft) namespace (T418706) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:11 elukey@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
  • 14:11 elukey@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
  • 14:10 elukey@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
  • 14:10 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for lawiki: add Adumbratio (draft) namespace (T418706)
  • 14:10 elukey@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
  • 14:08 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1028.eqiad.wmnet
  • 14:05 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test1005.wikimedia.org
  • 14:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P89511 and previous config saved to /var/cache/conftool/dbconfig/20260302-140529-marostegui.json
  • 14:04 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1097.eqiad.wmnet with reason: host reimage
  • 14:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P89510 and previous config saved to /var/cache/conftool/dbconfig/20260302-140302-marostegui.json
  • 14:02 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1028.eqiad.wmnet
  • 14:01 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1160.eqiad.wmnet
  • 14:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host idp-test1005.wikimedia.org
  • 14:01 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1159.eqiad.wmnet
  • 14:00 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1025.eqiad.wmnet
  • 13:57 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1097.eqiad.wmnet with reason: host reimage
  • 13:54 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1025.eqiad.wmnet
  • 13:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214 (T418465)', diff saved to https://phabricator.wikimedia.org/P89509 and previous config saved to /var/cache/conftool/dbconfig/20260302-135021-marostegui.json
  • 13:47 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P89508 and previous config saved to /var/cache/conftool/dbconfig/20260302-134754-marostegui.json
  • 13:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1159.eqiad.wmnet
  • 13:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1158.eqiad.wmnet
  • 13:40 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host ms-be1097.eqiad.wmnet with OS bullseye
  • 13:38 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1097
  • 13:38 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1097
  • 13:38 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:38 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt ms-be1097 - jclark@cumin1003"
  • 13:37 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt ms-be1097 - jclark@cumin1003"
  • 13:35 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1158.eqiad.wmnet
  • 13:35 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1157.eqiad.wmnet
  • 13:35 ladsgroup@cumin1003: END (FAIL) - Cookbook sre.wikireplicas.update-views (exit_code=99)
  • 13:35 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1214 (T418465)', diff saved to https://phabricator.wikimedia.org/P89507 and previous config saved to /var/cache/conftool/dbconfig/20260302-133503-marostegui.json
  • 13:34 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1214.eqiad.wmnet with reason: Maintenance
  • 13:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203 (T418465)', diff saved to https://phabricator.wikimedia.org/P89506 and previous config saved to /var/cache/conftool/dbconfig/20260302-133440-marostegui.json
  • 13:32 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T418465)', diff saved to https://phabricator.wikimedia.org/P89505 and previous config saved to /var/cache/conftool/dbconfig/20260302-133247-marostegui.json
  • 13:28 ladsgroup@cumin1003: START - Cookbook sre.wikireplicas.update-views
  • 13:27 jclark@cumin1003: START - Cookbook sre.dns.netbox
  • 13:27 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1097
  • 13:26 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1097
  • 13:24 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1157.eqiad.wmnet
  • 13:24 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1156.eqiad.wmnet
  • 13:22 brouberol: Running `echo 'https://turnilo-next.wikimedia.org' | mwscript-k8s --attach -- purgeList.php`
  • 13:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P89504 and previous config saved to /var/cache/conftool/dbconfig/20260302-131932-marostegui.json
  • 13:16 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2181 (T418465)', diff saved to https://phabricator.wikimedia.org/P89503 and previous config saved to /var/cache/conftool/dbconfig/20260302-131653-marostegui.json
  • 13:16 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2181.codfw.wmnet with reason: Maintenance
  • 13:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2167 (T418465)', diff saved to https://phabricator.wikimedia.org/P89502 and previous config saved to /var/cache/conftool/dbconfig/20260302-131630-marostegui.json
  • 13:15 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1024.eqiad.wmnet
  • 13:14 moritzm: installing libcap2 updates from Bookworm point release
  • 13:12 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1156.eqiad.wmnet
  • 13:12 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1155.eqiad.wmnet
  • 13:08 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1024.eqiad.wmnet
  • 13:07 ladsgroup@cumin1003: END (FAIL) - Cookbook sre.wikireplicas.update-views (exit_code=99)
  • 13:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P89500 and previous config saved to /var/cache/conftool/dbconfig/20260302-130424-marostegui.json
  • 13:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2167', diff saved to https://phabricator.wikimedia.org/P89499 and previous config saved to /var/cache/conftool/dbconfig/20260302-130122-marostegui.json
  • 13:00 ladsgroup@cumin1003: START - Cookbook sre.wikireplicas.update-views
  • 12:58 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2356.codfw.wmnet
  • 12:58 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2356.codfw.wmnet
  • 12:58 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1155.eqiad.wmnet
  • 12:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1154.eqiad.wmnet
  • 12:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203 (T418465)', diff saved to https://phabricator.wikimedia.org/P89498 and previous config saved to /var/cache/conftool/dbconfig/20260302-124917-marostegui.json
  • 12:46 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1154.eqiad.wmnet
  • 12:46 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1153.eqiad.wmnet
  • 12:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2167', diff saved to https://phabricator.wikimedia.org/P89497 and previous config saved to /var/cache/conftool/dbconfig/20260302-124615-marostegui.json
  • 12:35 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1153.eqiad.wmnet
  • 12:35 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1152.eqiad.wmnet
  • 12:32 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1203 (T418465)', diff saved to https://phabricator.wikimedia.org/P89494 and previous config saved to /var/cache/conftool/dbconfig/20260302-123253-marostegui.json
  • 12:32 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1203.eqiad.wmnet with reason: Maintenance
  • 12:32 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193 (T418465)', diff saved to https://phabricator.wikimedia.org/P89493 and previous config saved to /var/cache/conftool/dbconfig/20260302-123229-marostegui.json
  • 12:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2167 (T418465)', diff saved to https://phabricator.wikimedia.org/P89492 and previous config saved to /var/cache/conftool/dbconfig/20260302-123108-marostegui.json
  • 12:27 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
  • 12:26 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
  • 12:24 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1152.eqiad.wmnet
  • 12:24 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1151.eqiad.wmnet
  • 12:23 kevinbazira@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 12:20 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
  • 12:20 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
  • 12:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P89491 and previous config saved to /var/cache/conftool/dbconfig/20260302-121722-marostegui.json
  • 12:15 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2167 (T418465)', diff saved to https://phabricator.wikimedia.org/P89490 and previous config saved to /var/cache/conftool/dbconfig/20260302-121525-marostegui.json
  • 12:15 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2167.codfw.wmnet with reason: Maintenance
  • 12:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T418465)', diff saved to https://phabricator.wikimedia.org/P89489 and previous config saved to /var/cache/conftool/dbconfig/20260302-121501-marostegui.json
  • 12:12 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1151.eqiad.wmnet
  • 12:12 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1150.eqiad.wmnet
  • 12:02 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P89488 and previous config saved to /var/cache/conftool/dbconfig/20260302-120214-marostegui.json
  • 12:00 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1150.eqiad.wmnet
  • 11:59 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P89487 and previous config saved to /var/cache/conftool/dbconfig/20260302-115953-marostegui.json
  • 11:47 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193 (T418465)', diff saved to https://phabricator.wikimedia.org/P89486 and previous config saved to /var/cache/conftool/dbconfig/20260302-114706-marostegui.json
  • 11:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P89485 and previous config saved to /var/cache/conftool/dbconfig/20260302-114446-marostegui.json
  • 11:30 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1193 (T418465)', diff saved to https://phabricator.wikimedia.org/P89484 and previous config saved to /var/cache/conftool/dbconfig/20260302-113034-marostegui.json
  • 11:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-jumbo1001.eqiad.wmnet
  • 11:30 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1193.eqiad.wmnet with reason: Maintenance
  • 11:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192 (T418465)', diff saved to https://phabricator.wikimedia.org/P89483 and previous config saved to /var/cache/conftool/dbconfig/20260302-113010-marostegui.json
  • 11:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T418465)', diff saved to https://phabricator.wikimedia.org/P89482 and previous config saved to /var/cache/conftool/dbconfig/20260302-112937-marostegui.json
  • 11:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-jumbo1001.eqiad.wmnet
  • 11:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P89481 and previous config saved to /var/cache/conftool/dbconfig/20260302-111502-marostegui.json
  • 11:13 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2166 (T418465)', diff saved to https://phabricator.wikimedia.org/P89480 and previous config saved to /var/cache/conftool/dbconfig/20260302-111351-marostegui.json
  • 11:13 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2166.codfw.wmnet with reason: Maintenance
  • 11:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T418465)', diff saved to https://phabricator.wikimedia.org/P89479 and previous config saved to /var/cache/conftool/dbconfig/20260302-111327-marostegui.json
  • 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cumin2003.codfw.wmnet
  • 10:59 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P89478 and previous config saved to /var/cache/conftool/dbconfig/20260302-105955-marostegui.json
  • 10:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P89477 and previous config saved to /var/cache/conftool/dbconfig/20260302-105818-marostegui.json
  • 10:57 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host cumin2003.codfw.wmnet
  • 10:55 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
  • 10:54 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
  • 10:53 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 10:52 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 10:51 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
  • 10:50 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
  • 10:50 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
  • 10:49 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
  • 10:46 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_magru and A:cp - 3.0 upgrade ()
  • 10:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192 (T418465)', diff saved to https://phabricator.wikimedia.org/P89476 and previous config saved to /var/cache/conftool/dbconfig/20260302-104446-marostegui.json
  • 10:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P89475 and previous config saved to /var/cache/conftool/dbconfig/20260302-104310-marostegui.json
  • 10:28 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1192 (T418465)', diff saved to https://phabricator.wikimedia.org/P89474 and previous config saved to /var/cache/conftool/dbconfig/20260302-102825-marostegui.json
  • 10:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1192.eqiad.wmnet with reason: Maintenance
  • 10:28 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T418465)', diff saved to https://phabricator.wikimedia.org/P89473 and previous config saved to /var/cache/conftool/dbconfig/20260302-102800-marostegui.json
  • 10:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P89472 and previous config saved to /var/cache/conftool/dbconfig/20260302-101252-marostegui.json
  • 10:12 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2164 (T418465)', diff saved to https://phabricator.wikimedia.org/P89471 and previous config saved to /var/cache/conftool/dbconfig/20260302-101200-marostegui.json
  • 10:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2164.codfw.wmnet with reason: Maintenance
  • 10:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T418465)', diff saved to https://phabricator.wikimedia.org/P89470 and previous config saved to /var/cache/conftool/dbconfig/20260302-101135-marostegui.json
  • 10:08 moritzm: installing intel-microcode bugfix updates on Bookworm hosts
  • 09:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P89469 and previous config saved to /var/cache/conftool/dbconfig/20260302-095744-marostegui.json
  • 09:57 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_magru and A:cp - 3.0 upgrade ()
  • 09:56 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P89468 and previous config saved to /var/cache/conftool/dbconfig/20260302-095627-marostegui.json
  • 09:55 fabfur: start upgrading haproxy to 3.0 on A:cp-text_magru (T417253)
  • 09:42 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T418465)', diff saved to https://phabricator.wikimedia.org/P89467 and previous config saved to /var/cache/conftool/dbconfig/20260302-094236-marostegui.json
  • 09:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P89466 and previous config saved to /var/cache/conftool/dbconfig/20260302-094118-marostegui.json
  • 09:35 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
  • 09:35 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
  • 09:34 moritzm: installing GnuTLS security updates
  • 09:34 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
  • 09:33 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
  • 09:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T418465)', diff saved to https://phabricator.wikimedia.org/P89465 and previous config saved to /var/cache/conftool/dbconfig/20260302-092610-marostegui.json
  • 09:26 mlitn@deploy2002: Finished scap sync-world: Backport for Limit additional whitespace to sticky header version only (T416598) (duration: 11m 02s)
  • 09:26 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1178 (T418465)', diff saved to https://phabricator.wikimedia.org/P89464 and previous config saved to /var/cache/conftool/dbconfig/20260302-092600-marostegui.json
  • 09:25 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1178.eqiad.wmnet with reason: Maintenance
  • 09:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T418465)', diff saved to https://phabricator.wikimedia.org/P89463 and previous config saved to /var/cache/conftool/dbconfig/20260302-092535-marostegui.json
  • 09:21 mlitn@deploy2002: mlitn: Continuing with sync
  • 09:16 mlitn@deploy2002: mlitn: Backport for Limit additional whitespace to sticky header version only (T416598) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 09:15 mlitn@deploy2002: Started scap sync-world: Backport for Limit additional whitespace to sticky header version only (T416598)
  • 09:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P89462 and previous config saved to /var/cache/conftool/dbconfig/20260302-091027-marostegui.json
  • 09:10 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2163 (T418465)', diff saved to https://phabricator.wikimedia.org/P89461 and previous config saved to /var/cache/conftool/dbconfig/20260302-091003-marostegui.json
  • 09:09 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2163.codfw.wmnet with reason: Maintenance
  • 09:09 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2161 (T418465)', diff saved to https://phabricator.wikimedia.org/P89460 and previous config saved to /var/cache/conftool/dbconfig/20260302-090938-marostegui.json
  • 09:08 kharlan@deploy2002: Finished scap sync-world: Backport for HCaptchaEnterpriseHealthChecker: Add configurable retry count and delay (T418477) (duration: 16m 09s)
  • 09:02 kharlan@deploy2002: kharlan: Continuing with sync
  • 08:57 kharlan@deploy2002: kharlan: Backport for HCaptchaEnterpriseHealthChecker: Add configurable retry count and delay (T418477) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 08:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P89459 and previous config saved to /var/cache/conftool/dbconfig/20260302-085519-marostegui.json
  • 08:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P89458 and previous config saved to /var/cache/conftool/dbconfig/20260302-085430-marostegui.json
  • 08:51 kharlan@deploy2002: Started scap sync-world: Backport for HCaptchaEnterpriseHealthChecker: Add configurable retry count and delay (T418477)
  • 08:48 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_magru and A:cp - 3.0 upgrade ()
  • 08:47 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dbproxy1029.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
  • 08:45 moritzm: installing libxml2 security updates
  • 08:44 kgraessle@deploy2002: Finished scap sync-world: Backport for Enable revert risk filters for first batch of wikis: < 1000 monthly edits (T411485) (duration: 37m 12s)
  • 08:42 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dbproxy1029.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
  • 08:41 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dbproxy1028.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
  • 08:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T418465)', diff saved to https://phabricator.wikimedia.org/P89457 and previous config saved to /var/cache/conftool/dbconfig/20260302-084010-marostegui.json
  • 08:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P89456 and previous config saved to /var/cache/conftool/dbconfig/20260302-083922-marostegui.json
  • 08:31 kgraessle@deploy2002: kgraessle: Continuing with sync
  • 08:30 kgraessle@deploy2002: kgraessle: Backport for Enable revert risk filters for first batch of wikis: < 1000 monthly edits (T411485) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 08:30 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dbproxy1028.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
  • 08:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2161 (T418465)', diff saved to https://phabricator.wikimedia.org/P89455 and previous config saved to /var/cache/conftool/dbconfig/20260302-082414-marostegui.json
  • 08:23 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1177 (T418465)', diff saved to https://phabricator.wikimedia.org/P89454 and previous config saved to /var/cache/conftool/dbconfig/20260302-082333-marostegui.json
  • 08:23 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1177.eqiad.wmnet with reason: Maintenance
  • 08:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T418465)', diff saved to https://phabricator.wikimedia.org/P89453 and previous config saved to /var/cache/conftool/dbconfig/20260302-082309-marostegui.json
  • 08:22 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on dbproxy1028.eqiad.wmnet with reason: Maintenance
  • 08:19 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on dbproxy1029.eqiad.wmnet with reason: Maintenance
  • 08:08 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2161 (T418465)', diff saved to https://phabricator.wikimedia.org/P89452 and previous config saved to /var/cache/conftool/dbconfig/20260302-080813-marostegui.json
  • 08:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2161.codfw.wmnet with reason: Maintenance
  • 08:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P89451 and previous config saved to /var/cache/conftool/dbconfig/20260302-080800-marostegui.json
  • 08:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T418465)', diff saved to https://phabricator.wikimedia.org/P89450 and previous config saved to /var/cache/conftool/dbconfig/20260302-080748-marostegui.json
  • 08:07 kgraessle@deploy2002: Started scap sync-world: Backport for Enable revert risk filters for first batch of wikis: < 1000 monthly edits (T411485)
  • 08:05 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_magru and A:cp - 3.0 upgrade ()
  • 08:05 fabfur: start upgrading haproxy to 3.0 on A:cp-upload_magru (T417253)
  • 07:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P89449 and previous config saved to /var/cache/conftool/dbconfig/20260302-075252-marostegui.json
  • 07:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P89448 and previous config saved to /var/cache/conftool/dbconfig/20260302-075241-marostegui.json
  • 07:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T418465)', diff saved to https://phabricator.wikimedia.org/P89447 and previous config saved to /var/cache/conftool/dbconfig/20260302-073745-marostegui.json
  • 07:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P89446 and previous config saved to /var/cache/conftool/dbconfig/20260302-073732-marostegui.json
  • 07:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T418465)', diff saved to https://phabricator.wikimedia.org/P89445 and previous config saved to /var/cache/conftool/dbconfig/20260302-072224-marostegui.json
  • 07:21 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1172 (T418465)', diff saved to https://phabricator.wikimedia.org/P89444 and previous config saved to /var/cache/conftool/dbconfig/20260302-072058-marostegui.json
  • 07:20 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1172.eqiad.wmnet with reason: Maintenance
  • 07:05 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 07:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T418465)', diff saved to https://phabricator.wikimedia.org/P89443 and previous config saved to /var/cache/conftool/dbconfig/20260302-070523-marostegui.json
  • 07:05 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2154 (T418465)', diff saved to https://phabricator.wikimedia.org/P89442 and previous config saved to /var/cache/conftool/dbconfig/20260302-070512-marostegui.json
  • 07:05 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2154.codfw.wmnet with reason: Maintenance
  • 07:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2152 (T418465)', diff saved to https://phabricator.wikimedia.org/P89441 and previous config saved to /var/cache/conftool/dbconfig/20260302-070447-marostegui.json
  • 07:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1244: After schema change
  • 06:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P89439 and previous config saved to /var/cache/conftool/dbconfig/20260302-065014-marostegui.json
  • 06:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P89438 and previous config saved to /var/cache/conftool/dbconfig/20260302-064938-marostegui.json
  • 06:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P89436 and previous config saved to /var/cache/conftool/dbconfig/20260302-063506-marostegui.json
  • 06:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P89435 and previous config saved to /var/cache/conftool/dbconfig/20260302-063430-marostegui.json
  • 06:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T418465)', diff saved to https://phabricator.wikimedia.org/P89433 and previous config saved to /var/cache/conftool/dbconfig/20260302-061957-marostegui.json
  • 06:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2152 (T418465)', diff saved to https://phabricator.wikimedia.org/P89432 and previous config saved to /var/cache/conftool/dbconfig/20260302-061922-marostegui.json
  • 06:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2240.codfw.wmnet with reason: Maintenance
  • 06:16 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1244: After schema change
  • 06:15 marostegui@dns1004: END - running authdns-update
  • 06:14 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2240 T418080', diff saved to https://phabricator.wikimedia.org/P89430 and previous config saved to /var/cache/conftool/dbconfig/20260302-061428-marostegui.json
  • 06:13 marostegui@dns1004: START - running authdns-update
  • 06:13 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db2179 to s4 primary and set section read-write T418080', diff saved to https://phabricator.wikimedia.org/P89429 and previous config saved to /var/cache/conftool/dbconfig/20260302-061316-marostegui.json
  • 06:12 marostegui@cumin1003: dbctl commit (dc=all): 'Set s4 codfw as read-only for maintenance - T418080', diff saved to https://phabricator.wikimedia.org/P89428 and previous config saved to /var/cache/conftool/dbconfig/20260302-061252-marostegui.json
  • 06:06 marostegui: Starting s4 codfw failover from db2240 to db2179 - T418080
  • 06:03 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 42 hosts with reason: Primary switchover s4 T418080
  • 06:03 marostegui@cumin1003: dbctl commit (dc=all): 'Set db2179 with weight 0 T418080', diff saved to https://phabricator.wikimedia.org/P89427 and previous config saved to /var/cache/conftool/dbconfig/20260302-060317-marostegui.json
  • 06:03 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1167 (T418465)', diff saved to https://phabricator.wikimedia.org/P89426 and previous config saved to /var/cache/conftool/dbconfig/20260302-060317-marostegui.json
  • 06:03 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 06:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1167.eqiad.wmnet with reason: Maintenance
  • 06:02 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2152 (T418465)', diff saved to https://phabricator.wikimedia.org/P89425 and previous config saved to /var/cache/conftool/dbconfig/20260302-060245-marostegui.json
  • 06:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2152.codfw.wmnet with reason: Maintenance
  • 05:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 05:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2220.codfw.wmnet with reason: Maintenance
  • 02:14 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 13m 13s)
  • 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
  • 00:50 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 00:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253 (T418465)', diff saved to https://phabricator.wikimedia.org/P89424 and previous config saved to /var/cache/conftool/dbconfig/20260302-004950-marostegui.json
  • 00:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253', diff saved to https://phabricator.wikimedia.org/P89423 and previous config saved to /var/cache/conftool/dbconfig/20260302-003441-marostegui.json
  • 00:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253', diff saved to https://phabricator.wikimedia.org/P89422 and previous config saved to /var/cache/conftool/dbconfig/20260302-001933-marostegui.json
  • 00:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253 (T418465)', diff saved to https://phabricator.wikimedia.org/P89421 and previous config saved to /var/cache/conftool/dbconfig/20260302-000425-marostegui.json
  • 00:02 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1253 (T418465)', diff saved to https://phabricator.wikimedia.org/P89420 and previous config saved to /var/cache/conftool/dbconfig/20260302-000208-marostegui.json
  • 00:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1253.eqiad.wmnet with reason: Maintenance
  • 00:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1236 (T418465)', diff saved to https://phabricator.wikimedia.org/P89419 and previous config saved to /var/cache/conftool/dbconfig/20260302-000143-marostegui.json

2026-03-01

  • 23:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1236', diff saved to https://phabricator.wikimedia.org/P89418 and previous config saved to /var/cache/conftool/dbconfig/20260301-234635-marostegui.json
  • 23:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 (T418465)', diff saved to https://phabricator.wikimedia.org/P89417 and previous config saved to /var/cache/conftool/dbconfig/20260301-233524-marostegui.json
  • 23:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1236', diff saved to https://phabricator.wikimedia.org/P89416 and previous config saved to /var/cache/conftool/dbconfig/20260301-233127-marostegui.json
  • 23:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P89415 and previous config saved to /var/cache/conftool/dbconfig/20260301-232016-marostegui.json
  • 23:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1236 (T418465)', diff saved to https://phabricator.wikimedia.org/P89414 and previous config saved to /var/cache/conftool/dbconfig/20260301-231619-marostegui.json
  • 23:14 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1236 (T418465)', diff saved to https://phabricator.wikimedia.org/P89413 and previous config saved to /var/cache/conftool/dbconfig/20260301-231404-marostegui.json
  • 23:13 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1236.eqiad.wmnet with reason: Maintenance
  • 23:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231 (T418465)', diff saved to https://phabricator.wikimedia.org/P89412 and previous config saved to /var/cache/conftool/dbconfig/20260301-231339-marostegui.json
  • 23:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P89411 and previous config saved to /var/cache/conftool/dbconfig/20260301-230508-marostegui.json
  • 22:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P89410 and previous config saved to /var/cache/conftool/dbconfig/20260301-225832-marostegui.json
  • 22:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 (T418465)', diff saved to https://phabricator.wikimedia.org/P89409 and previous config saved to /var/cache/conftool/dbconfig/20260301-224959-marostegui.json
  • 22:44 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2222 (T418465)', diff saved to https://phabricator.wikimedia.org/P89408 and previous config saved to /var/cache/conftool/dbconfig/20260301-224451-marostegui.json
  • 22:44 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2222.codfw.wmnet with reason: Maintenance
  • 22:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 (T418465)', diff saved to https://phabricator.wikimedia.org/P89407 and previous config saved to /var/cache/conftool/dbconfig/20260301-224426-marostegui.json
  • 22:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P89406 and previous config saved to /var/cache/conftool/dbconfig/20260301-224324-marostegui.json
  • 22:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P89405 and previous config saved to /var/cache/conftool/dbconfig/20260301-222919-marostegui.json
  • 22:28 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231 (T418465)', diff saved to https://phabricator.wikimedia.org/P89404 and previous config saved to /var/cache/conftool/dbconfig/20260301-222815-marostegui.json
  • 22:26 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1231 (T418465)', diff saved to https://phabricator.wikimedia.org/P89403 and previous config saved to /var/cache/conftool/dbconfig/20260301-222600-marostegui.json
  • 22:25 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1231.eqiad.wmnet with reason: Maintenance
  • 22:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T418465)', diff saved to https://phabricator.wikimedia.org/P89402 and previous config saved to /var/cache/conftool/dbconfig/20260301-222536-marostegui.json
  • 22:14 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P89401 and previous config saved to /var/cache/conftool/dbconfig/20260301-221410-marostegui.json
  • 22:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P89400 and previous config saved to /var/cache/conftool/dbconfig/20260301-221027-marostegui.json
  • 21:59 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 (T418465)', diff saved to https://phabricator.wikimedia.org/P89399 and previous config saved to /var/cache/conftool/dbconfig/20260301-215902-marostegui.json
  • 21:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P89398 and previous config saved to /var/cache/conftool/dbconfig/20260301-215519-marostegui.json
  • 21:54 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2221 (T418465)', diff saved to https://phabricator.wikimedia.org/P89397 and previous config saved to /var/cache/conftool/dbconfig/20260301-215404-marostegui.json
  • 21:53 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2221.codfw.wmnet with reason: Maintenance
  • 21:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218 (T418465)', diff saved to https://phabricator.wikimedia.org/P89396 and previous config saved to /var/cache/conftool/dbconfig/20260301-215339-marostegui.json
  • 21:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T418465)', diff saved to https://phabricator.wikimedia.org/P89395 and previous config saved to /var/cache/conftool/dbconfig/20260301-214011-marostegui.json
  • 21:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to https://phabricator.wikimedia.org/P89394 and previous config saved to /var/cache/conftool/dbconfig/20260301-213831-marostegui.json
  • 21:34 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1227 (T418465)', diff saved to https://phabricator.wikimedia.org/P89393 and previous config saved to /var/cache/conftool/dbconfig/20260301-213410-marostegui.json
  • 21:34 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1227.eqiad.wmnet with reason: Maintenance
  • 21:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T418465)', diff saved to https://phabricator.wikimedia.org/P89392 and previous config saved to /var/cache/conftool/dbconfig/20260301-213346-marostegui.json
  • 21:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to https://phabricator.wikimedia.org/P89391 and previous config saved to /var/cache/conftool/dbconfig/20260301-212323-marostegui.json
  • 21:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P89390 and previous config saved to /var/cache/conftool/dbconfig/20260301-211837-marostegui.json
  • 21:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218 (T418465)', diff saved to https://phabricator.wikimedia.org/P89389 and previous config saved to /var/cache/conftool/dbconfig/20260301-210815-marostegui.json
  • 21:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P89388 and previous config saved to /var/cache/conftool/dbconfig/20260301-210329-marostegui.json
  • 21:03 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2218 (T418465)', diff saved to https://phabricator.wikimedia.org/P89387 and previous config saved to /var/cache/conftool/dbconfig/20260301-210309-marostegui.json
  • 21:03 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2218.codfw.wmnet with reason: Maintenance
  • 21:02 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 (T418465)', diff saved to https://phabricator.wikimedia.org/P89386 and previous config saved to /var/cache/conftool/dbconfig/20260301-210244-marostegui.json
  • 20:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T418465)', diff saved to https://phabricator.wikimedia.org/P89385 and previous config saved to /var/cache/conftool/dbconfig/20260301-204820-marostegui.json
  • 20:47 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P89384 and previous config saved to /var/cache/conftool/dbconfig/20260301-204736-marostegui.json
  • 20:46 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1202 (T418465)', diff saved to https://phabricator.wikimedia.org/P89383 and previous config saved to /var/cache/conftool/dbconfig/20260301-204606-marostegui.json
  • 20:45 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1202.eqiad.wmnet with reason: Maintenance
  • 20:45 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T418465)', diff saved to https://phabricator.wikimedia.org/P89382 and previous config saved to /var/cache/conftool/dbconfig/20260301-204541-marostegui.json
  • 20:32 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P89381 and previous config saved to /var/cache/conftool/dbconfig/20260301-203227-marostegui.json
  • 20:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P89380 and previous config saved to /var/cache/conftool/dbconfig/20260301-203033-marostegui.json
  • 20:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 (T418465)', diff saved to https://phabricator.wikimedia.org/P89379 and previous config saved to /var/cache/conftool/dbconfig/20260301-201720-marostegui.json
  • 20:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P89378 and previous config saved to /var/cache/conftool/dbconfig/20260301-201525-marostegui.json
  • 20:12 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2208 (T418465)', diff saved to https://phabricator.wikimedia.org/P89377 and previous config saved to /var/cache/conftool/dbconfig/20260301-201212-marostegui.json
  • 20:12 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2208.codfw.wmnet with reason: Maintenance
  • 20:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2200.codfw.wmnet with reason: Maintenance
  • 20:04 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2198.codfw.wmnet with reason: Maintenance
  • 20:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T418465)', diff saved to https://phabricator.wikimedia.org/P89376 and previous config saved to /var/cache/conftool/dbconfig/20260301-200422-marostegui.json
  • 20:00 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T418465)', diff saved to https://phabricator.wikimedia.org/P89375 and previous config saved to /var/cache/conftool/dbconfig/20260301-200016-marostegui.json
  • 19:58 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1194 (T418465)', diff saved to https://phabricator.wikimedia.org/P89374 and previous config saved to /var/cache/conftool/dbconfig/20260301-195803-marostegui.json
  • 19:57 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1194.eqiad.wmnet with reason: Maintenance
  • 19:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T418465)', diff saved to https://phabricator.wikimedia.org/P89373 and previous config saved to /var/cache/conftool/dbconfig/20260301-195738-marostegui.json
  • 19:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P89372 and previous config saved to /var/cache/conftool/dbconfig/20260301-194914-marostegui.json
  • 19:42 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P89371 and previous config saved to /var/cache/conftool/dbconfig/20260301-194230-marostegui.json
  • 19:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P89370 and previous config saved to /var/cache/conftool/dbconfig/20260301-193406-marostegui.json
  • 19:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P89369 and previous config saved to /var/cache/conftool/dbconfig/20260301-192721-marostegui.json
  • 19:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T418465)', diff saved to https://phabricator.wikimedia.org/P89368 and previous config saved to /var/cache/conftool/dbconfig/20260301-191858-marostegui.json
  • 19:13 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2182 (T418465)', diff saved to https://phabricator.wikimedia.org/P89367 and previous config saved to /var/cache/conftool/dbconfig/20260301-191340-marostegui.json
  • 19:13 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2182.codfw.wmnet with reason: Maintenance
  • 19:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T418465)', diff saved to https://phabricator.wikimedia.org/P89366 and previous config saved to /var/cache/conftool/dbconfig/20260301-191315-marostegui.json
  • 19:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T418465)', diff saved to https://phabricator.wikimedia.org/P89365 and previous config saved to /var/cache/conftool/dbconfig/20260301-191213-marostegui.json
  • 19:10 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1191 (T418465)', diff saved to https://phabricator.wikimedia.org/P89364 and previous config saved to /var/cache/conftool/dbconfig/20260301-190958-marostegui.json
  • 19:09 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1191.eqiad.wmnet with reason: Maintenance
  • 19:09 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T418465)', diff saved to https://phabricator.wikimedia.org/P89363 and previous config saved to /var/cache/conftool/dbconfig/20260301-190934-marostegui.json
  • 18:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P89362 and previous config saved to /var/cache/conftool/dbconfig/20260301-185807-marostegui.json
  • 18:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P89361 and previous config saved to /var/cache/conftool/dbconfig/20260301-185425-marostegui.json
  • 18:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P89360 and previous config saved to /var/cache/conftool/dbconfig/20260301-184259-marostegui.json
  • 18:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P89359 and previous config saved to /var/cache/conftool/dbconfig/20260301-183917-marostegui.json
  • 18:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T418465)', diff saved to https://phabricator.wikimedia.org/P89358 and previous config saved to /var/cache/conftool/dbconfig/20260301-182750-marostegui.json
  • 18:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T418465)', diff saved to https://phabricator.wikimedia.org/P89357 and previous config saved to /var/cache/conftool/dbconfig/20260301-182409-marostegui.json
  • 18:22 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2168 (T418465)', diff saved to https://phabricator.wikimedia.org/P89356 and previous config saved to /var/cache/conftool/dbconfig/20260301-182238-marostegui.json
  • 18:22 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2168.codfw.wmnet with reason: Maintenance
  • 18:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T418465)', diff saved to https://phabricator.wikimedia.org/P89355 and previous config saved to /var/cache/conftool/dbconfig/20260301-182213-marostegui.json
  • 18:21 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1174 (T418465)', diff saved to https://phabricator.wikimedia.org/P89354 and previous config saved to /var/cache/conftool/dbconfig/20260301-182153-marostegui.json
  • 18:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 18:18 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 18:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170 (T418465)', diff saved to https://phabricator.wikimedia.org/P89353 and previous config saved to /var/cache/conftool/dbconfig/20260301-181818-marostegui.json
  • 18:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P89352 and previous config saved to /var/cache/conftool/dbconfig/20260301-180705-marostegui.json
  • 18:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P89351 and previous config saved to /var/cache/conftool/dbconfig/20260301-180310-marostegui.json
  • 17:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P89350 and previous config saved to /var/cache/conftool/dbconfig/20260301-175157-marostegui.json
  • 17:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P89349 and previous config saved to /var/cache/conftool/dbconfig/20260301-174802-marostegui.json
  • 17:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T418465)', diff saved to https://phabricator.wikimedia.org/P89348 and previous config saved to /var/cache/conftool/dbconfig/20260301-173649-marostegui.json
  • 17:32 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170 (T418465)', diff saved to https://phabricator.wikimedia.org/P89347 and previous config saved to /var/cache/conftool/dbconfig/20260301-173253-marostegui.json
  • 17:31 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2159 (T418465)', diff saved to https://phabricator.wikimedia.org/P89346 and previous config saved to /var/cache/conftool/dbconfig/20260301-173134-marostegui.json
  • 17:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2159.codfw.wmnet with reason: Maintenance
  • 17:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T418465)', diff saved to https://phabricator.wikimedia.org/P89345 and previous config saved to /var/cache/conftool/dbconfig/20260301-173110-marostegui.json
  • 17:27 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1170 (T418465)', diff saved to https://phabricator.wikimedia.org/P89344 and previous config saved to /var/cache/conftool/dbconfig/20260301-172742-marostegui.json
  • 17:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 17:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T418465)', diff saved to https://phabricator.wikimedia.org/P89343 and previous config saved to /var/cache/conftool/dbconfig/20260301-172717-marostegui.json
  • 17:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P89342 and previous config saved to /var/cache/conftool/dbconfig/20260301-171602-marostegui.json
  • 17:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P89341 and previous config saved to /var/cache/conftool/dbconfig/20260301-171210-marostegui.json
  • 17:00 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P89340 and previous config saved to /var/cache/conftool/dbconfig/20260301-170053-marostegui.json
  • 16:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P89339 and previous config saved to /var/cache/conftool/dbconfig/20260301-165701-marostegui.json
  • 16:45 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T418465)', diff saved to https://phabricator.wikimedia.org/P89338 and previous config saved to /var/cache/conftool/dbconfig/20260301-164545-marostegui.json
  • 16:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T418465)', diff saved to https://phabricator.wikimedia.org/P89337 and previous config saved to /var/cache/conftool/dbconfig/20260301-164153-marostegui.json
  • 16:40 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2150 (T418465)', diff saved to https://phabricator.wikimedia.org/P89336 and previous config saved to /var/cache/conftool/dbconfig/20260301-164022-marostegui.json
  • 16:40 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2150.codfw.wmnet with reason: Maintenance
  • 16:39 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1158 (T418465)', diff saved to https://phabricator.wikimedia.org/P89335 and previous config saved to /var/cache/conftool/dbconfig/20260301-163938-marostegui.json
  • 16:39 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 16:39 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 16:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1230.eqiad.wmnet with reason: Maintenance
  • 16:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2213.codfw.wmnet with reason: Maintenance
  • 12:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228 (T418465)', diff saved to https://phabricator.wikimedia.org/P89334 and previous config saved to /var/cache/conftool/dbconfig/20260301-122201-marostegui.json
  • 12:06 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228', diff saved to https://phabricator.wikimedia.org/P89333 and previous config saved to /var/cache/conftool/dbconfig/20260301-120652-marostegui.json
  • 11:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228', diff saved to https://phabricator.wikimedia.org/P89332 and previous config saved to /var/cache/conftool/dbconfig/20260301-115144-marostegui.json
  • 11:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228 (T418465)', diff saved to https://phabricator.wikimedia.org/P89331 and previous config saved to /var/cache/conftool/dbconfig/20260301-113636-marostegui.json
  • 11:31 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2228 (T418465)', diff saved to https://phabricator.wikimedia.org/P89330 and previous config saved to /var/cache/conftool/dbconfig/20260301-113156-marostegui.json
  • 11:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2228.codfw.wmnet with reason: Maintenance
  • 11:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223 (T418465)', diff saved to https://phabricator.wikimedia.org/P89329 and previous config saved to /var/cache/conftool/dbconfig/20260301-113131-marostegui.json
  • 11:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 11:19 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1245.eqiad.wmnet with reason: Maintenance
  • 11:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1216.eqiad.wmnet with reason: Maintenance
  • 11:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210 (T418465)', diff saved to https://phabricator.wikimedia.org/P89328 and previous config saved to /var/cache/conftool/dbconfig/20260301-111658-marostegui.json
  • 11:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to https://phabricator.wikimedia.org/P89327 and previous config saved to /var/cache/conftool/dbconfig/20260301-111622-marostegui.json
  • 11:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P89326 and previous config saved to /var/cache/conftool/dbconfig/20260301-110151-marostegui.json
  • 11:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to https://phabricator.wikimedia.org/P89325 and previous config saved to /var/cache/conftool/dbconfig/20260301-110114-marostegui.json
  • 10:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P89324 and previous config saved to /var/cache/conftool/dbconfig/20260301-104642-marostegui.json
  • 10:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223 (T418465)', diff saved to https://phabricator.wikimedia.org/P89323 and previous config saved to /var/cache/conftool/dbconfig/20260301-104606-marostegui.json
  • 10:40 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2223 (T418465)', diff saved to https://phabricator.wikimedia.org/P89322 and previous config saved to /var/cache/conftool/dbconfig/20260301-104024-marostegui.json
  • 10:40 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2223.codfw.wmnet with reason: Maintenance
  • 10:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 (T418465)', diff saved to https://phabricator.wikimedia.org/P89321 and previous config saved to /var/cache/conftool/dbconfig/20260301-103958-marostegui.json
  • 10:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210 (T418465)', diff saved to https://phabricator.wikimedia.org/P89320 and previous config saved to /var/cache/conftool/dbconfig/20260301-103134-marostegui.json
  • 10:27 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1210 (T418465)', diff saved to https://phabricator.wikimedia.org/P89319 and previous config saved to /var/cache/conftool/dbconfig/20260301-102727-marostegui.json
  • 10:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1210.eqiad.wmnet with reason: Maintenance
  • 10:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T418465)', diff saved to https://phabricator.wikimedia.org/P89318 and previous config saved to /var/cache/conftool/dbconfig/20260301-102702-marostegui.json
  • 10:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P89317 and previous config saved to /var/cache/conftool/dbconfig/20260301-102450-marostegui.json
  • 10:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P89316 and previous config saved to /var/cache/conftool/dbconfig/20260301-101154-marostegui.json
  • 10:09 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P89315 and previous config saved to /var/cache/conftool/dbconfig/20260301-100942-marostegui.json
  • 09:56 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P89314 and previous config saved to /var/cache/conftool/dbconfig/20260301-095645-marostegui.json
  • 09:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 (T418465)', diff saved to https://phabricator.wikimedia.org/P89313 and previous config saved to /var/cache/conftool/dbconfig/20260301-095434-marostegui.json
  • 09:48 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2211 (T418465)', diff saved to https://phabricator.wikimedia.org/P89312 and previous config saved to /var/cache/conftool/dbconfig/20260301-094847-marostegui.json
  • 09:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2211.codfw.wmnet with reason: Maintenance
  • 09:44 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2201.codfw.wmnet with reason: Maintenance
  • 09:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192 (T418465)', diff saved to https://phabricator.wikimedia.org/P89311 and previous config saved to /var/cache/conftool/dbconfig/20260301-094432-marostegui.json
  • 09:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T418465)', diff saved to https://phabricator.wikimedia.org/P89310 and previous config saved to /var/cache/conftool/dbconfig/20260301-094137-marostegui.json
  • 09:38 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1207 (T418465)', diff saved to https://phabricator.wikimedia.org/P89309 and previous config saved to /var/cache/conftool/dbconfig/20260301-093835-marostegui.json
  • 09:38 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1207.eqiad.wmnet with reason: Maintenance
  • 09:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T418465)', diff saved to https://phabricator.wikimedia.org/P89308 and previous config saved to /var/cache/conftool/dbconfig/20260301-093810-marostegui.json
  • 09:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P89307 and previous config saved to /var/cache/conftool/dbconfig/20260301-092923-marostegui.json
  • 09:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P89306 and previous config saved to /var/cache/conftool/dbconfig/20260301-092302-marostegui.json
  • 09:14 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P89305 and previous config saved to /var/cache/conftool/dbconfig/20260301-091415-marostegui.json
  • 09:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P89304 and previous config saved to /var/cache/conftool/dbconfig/20260301-090754-marostegui.json
  • 08:59 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192 (T418465)', diff saved to https://phabricator.wikimedia.org/P89303 and previous config saved to /var/cache/conftool/dbconfig/20260301-085907-marostegui.json
  • 08:54 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2192 (T418465)', diff saved to https://phabricator.wikimedia.org/P89302 and previous config saved to /var/cache/conftool/dbconfig/20260301-085427-marostegui.json
  • 08:54 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2192.codfw.wmnet with reason: Maintenance
  • 08:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T418465)', diff saved to https://phabricator.wikimedia.org/P89301 and previous config saved to /var/cache/conftool/dbconfig/20260301-085403-marostegui.json
  • 08:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T418465)', diff saved to https://phabricator.wikimedia.org/P89300 and previous config saved to /var/cache/conftool/dbconfig/20260301-085246-marostegui.json
  • 08:49 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1200 (T418465)', diff saved to https://phabricator.wikimedia.org/P89299 and previous config saved to /var/cache/conftool/dbconfig/20260301-084952-marostegui.json
  • 08:49 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1200.eqiad.wmnet with reason: Maintenance
  • 08:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T418465)', diff saved to https://phabricator.wikimedia.org/P89298 and previous config saved to /var/cache/conftool/dbconfig/20260301-084928-marostegui.json
  • 08:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P89297 and previous config saved to /var/cache/conftool/dbconfig/20260301-083855-marostegui.json
  • 08:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P89296 and previous config saved to /var/cache/conftool/dbconfig/20260301-083420-marostegui.json
  • 08:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P89295 and previous config saved to /var/cache/conftool/dbconfig/20260301-082346-marostegui.json
  • 08:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P89294 and previous config saved to /var/cache/conftool/dbconfig/20260301-081912-marostegui.json
  • 08:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T418465)', diff saved to https://phabricator.wikimedia.org/P89293 and previous config saved to /var/cache/conftool/dbconfig/20260301-080838-marostegui.json
  • 08:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T418465)', diff saved to https://phabricator.wikimedia.org/P89292 and previous config saved to /var/cache/conftool/dbconfig/20260301-080404-marostegui.json
  • 08:03 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2178.codfw.wmnet with reason: Maintenance
  • 08:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 (T418465)', diff saved to https://phabricator.wikimedia.org/P89291 and previous config saved to /var/cache/conftool/dbconfig/20260301-080341-marostegui.json
  • 08:01 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1185 (T418465)', diff saved to https://phabricator.wikimedia.org/P89290 and previous config saved to /var/cache/conftool/dbconfig/20260301-080110-marostegui.json
  • 08:01 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1185.eqiad.wmnet with reason: Maintenance
  • 08:00 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T418465)', diff saved to https://phabricator.wikimedia.org/P89289 and previous config saved to /var/cache/conftool/dbconfig/20260301-080044-marostegui.json
  • 07:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P89288 and previous config saved to /var/cache/conftool/dbconfig/20260301-074833-marostegui.json
  • 07:45 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P89287 and previous config saved to /var/cache/conftool/dbconfig/20260301-074536-marostegui.json
  • 07:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P89286 and previous config saved to /var/cache/conftool/dbconfig/20260301-073324-marostegui.json
  • 07:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P89285 and previous config saved to /var/cache/conftool/dbconfig/20260301-073028-marostegui.json
  • 07:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 (T418465)', diff saved to https://phabricator.wikimedia.org/P89284 and previous config saved to /var/cache/conftool/dbconfig/20260301-071816-marostegui.json
  • 07:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T418465)', diff saved to https://phabricator.wikimedia.org/P89283 and previous config saved to /var/cache/conftool/dbconfig/20260301-071521-marostegui.json
  • 07:12 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2171 (T418465)', diff saved to https://phabricator.wikimedia.org/P89282 and previous config saved to /var/cache/conftool/dbconfig/20260301-071226-marostegui.json
  • 07:12 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2171.codfw.wmnet with reason: Maintenance
  • 07:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T418465)', diff saved to https://phabricator.wikimedia.org/P89281 and previous config saved to /var/cache/conftool/dbconfig/20260301-071201-marostegui.json
  • 07:11 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1161 (T418465)', diff saved to https://phabricator.wikimedia.org/P89280 and previous config saved to /var/cache/conftool/dbconfig/20260301-071113-marostegui.json
  • 07:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 07:10 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1161.eqiad.wmnet with reason: Maintenance
  • 07:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 (T418465)', diff saved to https://phabricator.wikimedia.org/P89279 and previous config saved to /var/cache/conftool/dbconfig/20260301-071040-marostegui.json
  • 06:56 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P89278 and previous config saved to /var/cache/conftool/dbconfig/20260301-065653-marostegui.json
  • 06:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P89277 and previous config saved to /var/cache/conftool/dbconfig/20260301-065531-marostegui.json
  • 06:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P89276 and previous config saved to /var/cache/conftool/dbconfig/20260301-064145-marostegui.json
  • 06:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P89275 and previous config saved to /var/cache/conftool/dbconfig/20260301-064023-marostegui.json
  • 06:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T418465)', diff saved to https://phabricator.wikimedia.org/P89274 and previous config saved to /var/cache/conftool/dbconfig/20260301-062636-marostegui.json
  • 06:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 (T418465)', diff saved to https://phabricator.wikimedia.org/P89273 and previous config saved to /var/cache/conftool/dbconfig/20260301-062515-marostegui.json
  • 06:21 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1159 (T418465)', diff saved to https://phabricator.wikimedia.org/P89272 and previous config saved to /var/cache/conftool/dbconfig/20260301-062108-marostegui.json
  • 06:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1159.eqiad.wmnet with reason: Maintenance
  • 06:20 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2157 (T418465)', diff saved to https://phabricator.wikimedia.org/P89271 and previous config saved to /var/cache/conftool/dbconfig/20260301-062047-marostegui.json
  • 06:20 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2157.codfw.wmnet with reason: Maintenance
  • 06:14 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2207.codfw.wmnet with reason: Maintenance
  • 06:14 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1222.eqiad.wmnet with reason: Maintenance
  • 02:13 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 13m 00s)
  • 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image

Other archives

See Server Admin Log/Archives.