Server Admin Log/Archive 100

From Wikitech

2025-12-31

  • 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
  • 00:22 wfan: civicrm upgraded from 03ff6ee3 to 9d26c426

2025-12-30

  • 18:48 moritzm: restarted Tomcat on idp2005
  • 18:40 jmm@dns1004: END - running authdns-update
  • 18:39 jmm@dns1004: START - running authdns-update
  • 10:20 jgleeson: payments-wiki upgraded from 81340350 to 857e80f2
  • 09:04 volans@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'clear' for AS: 18734
  • 09:04 volans@cumin1003: START - Cookbook sre.network.peering with action 'clear' for AS: 18734
  • 01:13 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 13m 08s)
  • 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image

2025-12-29

  • 01:19 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 18m 37s)
  • 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image

2025-12-28

  • 01:13 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 12m 48s)
  • 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image

2025-12-27

  • 01:24 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 24m 10s)
  • 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image

2025-12-26

  • 19:43 cgoubert@cumin1003: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on A:swift-fe-eqiad
  • 19:37 cgoubert@cumin1003: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on A:swift-fe-eqiad
  • 19:30 andrewbogott: test message
  • 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image

2025-12-25

  • 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image

2025-12-24

  • 15:29 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/thumbor: sync
  • 15:29 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/thumbor: sync
  • 15:26 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/thumbor: sync
  • 15:26 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/thumbor: sync
  • 15:16 kamila@cumin1003: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on A:swift-fe-eqiad
  • 15:10 kamila@cumin1003: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on A:swift-fe-eqiad
  • 14:27 kamila@cumin1003: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on A:swift-fe-codfw
  • 14:21 kamila@cumin1003: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on A:swift-fe-codfw
  • 14:16 kamila@cumin1003: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on A:swift-fe-eqiad
  • 14:09 kamila@cumin1003: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on A:swift-fe-eqiad
  • 01:19 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 18m 56s)
  • 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image

2025-12-23

  • 23:30 eileen: civicrm upgraded from 9cba6b6d to 03ff6ee3
  • 15:46 damilare: payments-wiki upgraded from 5c9a955f to 81340350
  • 15:15 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1328.eqiad.wmnet with OS trixie
  • 14:58 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1328.eqiad.wmnet with reason: host reimage
  • 14:51 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1328.eqiad.wmnet with reason: host reimage
  • 14:45 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker1329.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
  • 14:45 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1329.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
  • 14:39 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1328.eqiad.wmnet with OS trixie
  • 14:23 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1364.eqiad.wmnet with OS trixie
  • 14:23 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
  • 14:23 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
  • 14:08 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1364.eqiad.wmnet with reason: host reimage
  • 14:04 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1364.eqiad.wmnet with reason: host reimage
  • 13:56 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 13:55 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply
  • 13:54 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 13:54 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
  • 13:54 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 13:53 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
  • 13:53 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1364.eqiad.wmnet with OS trixie
  • 13:50 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker1364.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 13:42 vriley@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1364.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 13:42 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1364
  • 13:41 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1364
  • 13:41 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:41 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt wikikube-worker1364 - vriley@cumin1003"
  • 13:41 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt wikikube-worker1364 - vriley@cumin1003"
  • 13:40 urbanecm@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
  • 13:39 urbanecm@deploy2002: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
  • 13:37 vriley@cumin1003: START - Cookbook sre.dns.netbox
  • 13:15 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 13:14 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
  • 13:14 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 13:14 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
  • 13:14 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 13:14 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply
  • 13:12 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
  • 13:12 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
  • 12:43 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 12:43 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
  • 12:43 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 12:42 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
  • 12:42 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 12:41 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply
  • 12:27 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
  • 12:27 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
  • 11:57 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker1363.eqiad.wmnet with OS trixie
  • 11:16 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1361.eqiad.wmnet with OS trixie
  • 11:16 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
  • 11:15 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
  • 11:12 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 11:12 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply
  • 11:11 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 11:09 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
  • 11:09 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 11:08 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
  • 10:59 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1361.eqiad.wmnet with reason: host reimage
  • 10:53 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1361.eqiad.wmnet with reason: host reimage
  • 10:42 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1361.eqiad.wmnet with OS trixie
  • 10:37 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1363.eqiad.wmnet with OS trixie
  • 09:37 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker1363.eqiad.wmnet with OS trixie
  • 08:17 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1363.eqiad.wmnet with OS trixie
  • 08:12 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker1363.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 08:11 slyngshede@cumin1003: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Resquito out of all services on: 1 hosts
  • 08:07 slyngshede@cumin1003: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Resquito out of all services on: 1 hosts
  • 08:07 slyngshede@cumin1003: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Resquito out of all services on: 1 hosts
  • 08:07 slyngshede@cumin1003: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Resquito out of all services on: 1 hosts
  • 08:07 slyngshede@cumin1003: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Resquito out of all services on: 1 hosts
  • 08:05 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1362.eqiad.wmnet with OS trixie
  • 08:05 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
  • 08:05 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
  • 08:04 slyngshede@cumin1003: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Resquito out of all services on: 2444 hosts
  • 08:03 vriley@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1363.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 08:03 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1363
  • 08:02 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1363
  • 08:02 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:02 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt wikikube-worker1363 - vriley@cumin1003"
  • 08:02 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt wikikube-worker1363 - vriley@cumin1003"
  • 07:58 vriley@cumin1003: START - Cookbook sre.dns.netbox
  • 07:51 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker1361.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 07:49 vriley@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1361.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 07:49 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1362.eqiad.wmnet with reason: host reimage
  • 07:46 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1361
  • 07:45 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1361
  • 07:44 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 07:43 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1362.eqiad.wmnet with reason: host reimage
  • 07:41 vriley@cumin1003: START - Cookbook sre.dns.netbox
  • 07:39 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker1361.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 07:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1361.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 07:37 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker1361.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 07:36 vriley@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1361.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 07:32 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1362.eqiad.wmnet with OS trixie
  • 07:30 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker1362.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 07:30 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker1361.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 07:29 vriley@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1361.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 07:23 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker1361.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 07:22 vriley@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1361.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 07:22 vriley@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1362.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 07:20 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1362
  • 07:19 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker1361.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 07:19 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1362
  • 07:18 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 07:18 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt wikikube-worker1362 - vriley@cumin1003"
  • 07:18 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt wikikube-worker1362 - vriley@cumin1003"
  • 07:17 vriley@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1361.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 07:15 vriley@cumin1003: START - Cookbook sre.dns.netbox
  • 07:13 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker1361.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 07:06 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1360.eqiad.wmnet with OS trixie
  • 07:06 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
  • 07:05 vriley@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1361.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 07:05 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1361
  • 07:04 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
  • 07:03 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1361
  • 07:03 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 07:03 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt wikikube-worker1361 - vriley@cumin1003"
  • 07:03 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt wikikube-worker1361 - vriley@cumin1003"
  • 06:58 vriley@cumin1003: START - Cookbook sre.dns.netbox
  • 06:48 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1360.eqiad.wmnet with reason: host reimage
  • 06:45 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1360.eqiad.wmnet with reason: host reimage
  • 06:34 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1360.eqiad.wmnet with OS trixie
  • 06:25 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker1360.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 06:16 vriley@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1360.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 06:13 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1360
  • 06:12 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1360
  • 06:11 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 06:11 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt wikikube-worker1360 - vriley@cumin1003"
  • 06:11 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt wikikube-worker1360 - vriley@cumin1003"
  • 06:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.newpool (exit_code=0) pc1013 gradually with 4 steps - test
  • 06:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
  • 06:10 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
  • 06:10 marostegui@cumin1003: START - Cookbook sre.mysql.newpool pc1013 gradually with 4 steps - test
  • 06:09 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.newdepool (exit_code=0) pc1013 - test
  • 06:09 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
  • 06:09 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
  • 06:09 marostegui@cumin1003: START - Cookbook sre.mysql.newdepool pc1013 - test
  • 06:07 vriley@cumin1003: START - Cookbook sre.dns.netbox
  • 06:01 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts es2028.codfw.wmnet
  • 06:01 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 06:01 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: es2028.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
  • 06:00 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: es2028.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
  • 05:56 marostegui@cumin1003: START - Cookbook sre.dns.netbox
  • 05:50 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts es2028.codfw.wmnet
  • 05:02 mwpresync@deploy2002: Pruned MediaWiki: 1.46.0-wmf.4 (duration: 02m 34s)

2025-12-22

  • 22:14 jgleeson: civicrm upgraded from 110aeb6d to 9cba6b6d
  • 21:20 eileen: civicrm upgraded from d678d34e to 110aeb6d
  • 20:30 eileen: civicrm upgraded from 4eee8c62 to d678d34e
  • 19:45 mforns@deploy2002: helmfile [codfw] DONE helmfile.d/services/page-analytics: apply
  • 19:45 mforns@deploy2002: helmfile [codfw] START helmfile.d/services/page-analytics: apply
  • 19:44 mforns@deploy2002: helmfile [eqiad] DONE helmfile.d/services/page-analytics: apply
  • 19:44 mforns@deploy2002: helmfile [eqiad] START helmfile.d/services/page-analytics: apply
  • 19:44 mforns@deploy2002: helmfile [staging] DONE helmfile.d/services/page-analytics: apply
  • 19:44 mforns@deploy2002: helmfile [staging] START helmfile.d/services/page-analytics: apply
  • 19:43 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1333.eqiad.wmnet with OS trixie
  • 19:43 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
  • 19:42 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
  • 19:41 sbisson@deploy2002: Finished scap sync-world: Backport for Fix section loading on desktop (T413305) (duration: 20m 44s)
  • 19:39 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1330.eqiad.wmnet with OS trixie
  • 19:39 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
  • 19:39 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
  • 19:34 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1331.eqiad.wmnet with OS trixie
  • 19:34 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
  • 19:34 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
  • 19:30 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1332.eqiad.wmnet with OS trixie
  • 19:30 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
  • 19:30 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
  • 19:28 sbisson@deploy2002: sbisson: Continuing with sync
  • 19:26 sbisson@deploy2002: sbisson: Backport for Fix section loading on desktop (T413305) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 19:26 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1333.eqiad.wmnet with reason: host reimage
  • 19:23 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1329.eqiad.wmnet with OS trixie
  • 19:23 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
  • 19:22 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
  • 19:22 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1330.eqiad.wmnet with reason: host reimage
  • 19:20 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1334.eqiad.wmnet with OS trixie
  • 19:20 sbisson@deploy2002: Started scap sync-world: Backport for Fix section loading on desktop (T413305)
  • 19:20 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
  • 19:20 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
  • 19:18 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1331.eqiad.wmnet with reason: host reimage
  • 19:14 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1332.eqiad.wmnet with reason: host reimage
  • 19:07 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1329.eqiad.wmnet with reason: host reimage
  • 19:03 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1334.eqiad.wmnet with reason: host reimage
  • 19:03 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1333.eqiad.wmnet with reason: host reimage
  • 19:03 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1332.eqiad.wmnet with reason: host reimage
  • 19:01 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1331.eqiad.wmnet with reason: host reimage
  • 19:01 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1330.eqiad.wmnet with reason: host reimage
  • 18:59 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1329.eqiad.wmnet with reason: host reimage
  • 18:58 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1334.eqiad.wmnet with reason: host reimage
  • 18:52 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1333.eqiad.wmnet with OS trixie
  • 18:52 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1332.eqiad.wmnet with OS trixie
  • 18:50 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1331.eqiad.wmnet with OS trixie
  • 18:50 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1328.eqiad.wmnet with OS trixie
  • 18:50 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
  • 18:49 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1330.eqiad.wmnet with OS trixie
  • 18:49 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
  • 18:49 sbisson@deploy2002: Started scap sync-world: Backport for Fix section loading on desktop (T413305)
  • 18:48 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1329.eqiad.wmnet with OS trixie
  • 18:47 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1334.eqiad.wmnet with OS trixie
  • 18:33 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1328.eqiad.wmnet with reason: host reimage
  • 18:28 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1328.eqiad.wmnet with reason: host reimage
  • 18:00 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1328.eqiad.wmnet with OS trixie
  • 17:50 sbisson@deploy2002: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.5,1.46.0-wmf.7,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/med
  • 17:46 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker1334.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 17:35 elukey@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1334.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 17:35 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker1333.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 17:30 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wdqs1032.eqiad.wmnet with OS trixie
  • 17:29 mforns@deploy2002: helmfile [codfw] DONE helmfile.d/services/edit-analytics: apply
  • 17:29 mforns@deploy2002: helmfile [codfw] START helmfile.d/services/edit-analytics: apply
  • 17:26 mforns@deploy2002: helmfile [eqiad] DONE helmfile.d/services/edit-analytics: apply
  • 17:26 mforns@deploy2002: helmfile [eqiad] START helmfile.d/services/edit-analytics: apply
  • 17:24 elukey@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1333.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 17:24 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker1332.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 17:22 sbisson@deploy2002: Started scap sync-world: Backport for Fix section loading on desktop (T413305)
  • 17:13 elukey@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1332.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 17:11 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker1331.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 17:00 elukey@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1331.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:58 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1032.eqiad.wmnet with reason: host reimage
  • 16:54 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker1330.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:52 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1032.eqiad.wmnet with reason: host reimage
  • 16:49 tappof: lvextend /dev/vg0/srv on titan1001, titan1002, titan2002. T410152
  • 16:46 fabfur@dns1004: END - running authdns-update
  • 16:45 fabfur@dns1004: START - running authdns-update
  • 16:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1330.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:41 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker1329.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:41 sbisson@deploy2002: Started scap sync-world: Backport for Fix section loading on desktop (T413305)
  • 16:33 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1032.eqiad.wmnet with OS trixie
  • 16:32 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1031.eqiad.wmnet with OS trixie
  • 16:30 elukey@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1329.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:26 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker1328.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:22 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker1330.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:22 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker1331.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:21 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1331.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:21 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1330.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:15 elukey@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1328.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1031.eqiad.wmnet with reason: host reimage
  • 16:12 damilare: donorwiki upgraded from 14e22620 to 5c9a955f
  • 16:09 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1031.eqiad.wmnet with reason: host reimage
  • 16:09 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker1334.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:09 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker1333.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:08 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1334.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:08 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1333.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:07 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker1332.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:07 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker1331.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:07 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1332.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:06 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1331.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:06 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker1330.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:06 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker1329.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:05 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1330.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:05 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1329.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 15:59 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker1328.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 15:58 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1328.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 15:53 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:53 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added network and mgmt wikikube1328-34 servers - jclark@cumin1003"
  • 15:53 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added network and mgmt wikikube1328-34 servers - jclark@cumin1003"
  • 15:51 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1031.eqiad.wmnet with OS trixie
  • 15:50 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wdqs1031.eqiad.wmnet with OS trixie
  • 15:49 jclark@cumin1003: START - Cookbook sre.dns.netbox
  • 15:48 akosiaris@deploy2002: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
  • 15:48 akosiaris@deploy2002: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
  • 15:48 akosiaris@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 15:48 akosiaris@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 15:47 akosiaris@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
  • 15:47 akosiaris@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
  • 15:47 akosiaris@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 15:45 akosiaris@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 15:45 akosiaris@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
  • 15:45 akosiaris@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
  • 15:45 akosiaris@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
  • 15:44 akosiaris: remove limits from kube-state-metrics in ml-serve-{eqiad,codfw} ml-staging-codfw dse-k8s-{eqiad,codfw} aux-k8s-{eqiad,codfw} kubernetes clusters. No point in resource limits for this workload, it's an important cluster component.
  • 15:44 akosiaris@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
  • 15:44 akosiaris@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
  • 15:44 akosiaris@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
  • 15:39 akosiaris@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 15:38 akosiaris: remove limits from kube-state-metrics in wikikube and wikikube-staging clusters, no point in resource limits for this workload, it's an important cluster component
  • 15:38 akosiaris@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 15:38 akosiaris@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 15:38 akosiaris@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 15:38 akosiaris@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 15:37 akosiaris@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 15:36 akosiaris@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 15:36 akosiaris@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 14:58 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1031.eqiad.wmnet with reason: host reimage
  • 14:53 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1031.eqiad.wmnet with reason: host reimage
  • 14:35 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1031.eqiad.wmnet with OS trixie
  • 14:34 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1030.eqiad.wmnet with OS trixie
  • 14:32 urandom: serveraction powercycle restbase2034 (down, unresponsive)
  • 14:24 jgleeson: payments-wiki upgraded from 4d41d604 to 5c9a955f
  • 14:18 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
  • 14:18 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
  • 14:15 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1030.eqiad.wmnet with reason: host reimage
  • 14:11 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1030.eqiad.wmnet with reason: host reimage
  • 13:53 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1030.eqiad.wmnet with OS trixie
  • 13:32 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db1198 gradually with 4 steps - repooling
  • 13:16 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.newpool (exit_code=0) es2051 gradually with 4 steps - test T383674
  • 12:47 marostegui@cumin1003: START - Cookbook sre.mysql.pool db1198 gradually with 4 steps - repooling
  • 12:30 fceratto@cumin1003: START - Cookbook sre.mysql.newpool es2051 gradually with 4 steps - test T383674
  • 12:21 kevinbazira@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 11:57 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
  • 11:56 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1029.eqiad.wmnet with OS trixie
  • 11:43 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
  • 11:38 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1029.eqiad.wmnet with reason: host reimage
  • 11:32 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1029.eqiad.wmnet with reason: host reimage
  • 11:14 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
  • 11:12 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.newdepool (exit_code=0) es2051 - test T383674
  • 11:10 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
  • 11:10 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1029.eqiad.wmnet with OS trixie
  • 11:04 fceratto@cumin1003: START - Cookbook sre.mysql.newdepool es2051 - test T383674
  • 10:37 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
  • 10:35 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
  • 10:28 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
  • 10:28 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
  • 10:28 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
  • 10:27 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
  • 10:23 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
  • 10:23 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
  • 10:18 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
  • 10:07 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
  • 10:05 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
  • 10:04 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
  • 09:31 kevinbazira@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 09:02 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2249.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 08:57 elukey@cumin1003: START - Cookbook sre.hosts.provision for host db2249.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 08:47 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2249.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 08:47 elukey@cumin1003: START - Cookbook sre.hosts.provision for host db2249.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 08:40 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2249.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 08:40 elukey@cumin1003: START - Cookbook sre.hosts.provision for host db2249.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 08:32 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2249.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 08:28 elukey@cumin1003: START - Cookbook sre.hosts.provision for host db2249.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 08:14 elukey@deploy2002: Finished deploy [docker-pkg/deploy@1664255]: (no justification provided) (duration: 00m 08s)
  • 08:14 elukey@deploy2002: Started deploy [docker-pkg/deploy@1664255]: (no justification provided)
  • 08:04 elukey@deploy2002: Finished deploy [docker-pkg/deploy@1664255]: (no justification provided) (duration: 00m 07s)
  • 08:04 elukey@deploy2002: Started deploy [docker-pkg/deploy@1664255]: (no justification provided)
  • 08:03 elukey@deploy2002: Finished deploy [docker-pkg/deploy@1664255]: (no justification provided) (duration: 00m 11s)
  • 08:03 elukey@deploy2002: Started deploy [docker-pkg/deploy@1664255]: (no justification provided)
  • 07:32 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2028.codfw.wmnet with OS trixie
  • 07:09 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2028.codfw.wmnet with reason: host reimage
  • 07:05 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2028.codfw.wmnet with reason: host reimage
  • 06:49 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2028.codfw.wmnet with OS trixie
  • 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image

2025-12-21

  • 23:48 eileen: config revision changed from e478c565 to 8e95f98e
  • 01:24 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 23m 45s)
  • 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image

2025-12-20

  • 15:24 dzahn@dns1004: END - running authdns-update
  • 15:23 dzahn@dns1004: START - running authdns-update
  • 01:23 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 22m 39s)
  • 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image

2025-12-19

  • 21:03 jhancock@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2249.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 21:03 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host db2249.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 21:02 jhancock@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2249.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 21:02 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host db2249.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 21:01 jhancock@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2249.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 21:01 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host db2249.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 21:00 jhancock@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2249.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 21:00 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host db2249.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 20:59 jhancock@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2249.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 20:59 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host db2249.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 20:58 jhancock@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2249.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 20:56 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host db2249.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 20:55 jhancock@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db2249
  • 20:54 jhancock@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db2249
  • 20:50 jhancock@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 20:50 jhancock@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2249 to codfw - jhancock@cumin1003"
  • 20:50 jhancock@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2249 to codfw - jhancock@cumin1003"
  • 20:47 jhancock@cumin1003: START - Cookbook sre.dns.netbox
  • 18:16 mforns@deploy2002: helmfile [codfw] DONE helmfile.d/services/page-analytics: apply
  • 18:16 mforns@deploy2002: helmfile [codfw] START helmfile.d/services/page-analytics: apply
  • 18:16 mforns@deploy2002: helmfile [eqiad] DONE helmfile.d/services/page-analytics: apply
  • 18:15 mforns@deploy2002: helmfile [eqiad] START helmfile.d/services/page-analytics: apply
  • 18:15 mforns@deploy2002: helmfile [staging] DONE helmfile.d/services/page-analytics: apply
  • 18:15 mforns@deploy2002: helmfile [staging] START helmfile.d/services/page-analytics: apply
  • 17:45 cscott@deploy2002: Finished scap sync-world: Backport for Ensure that user interface language is "used" by postprocessing pipeline (T413227) (duration: 09m 07s)
  • 17:41 cscott@deploy2002: cscott: Continuing with sync
  • 17:38 cscott@deploy2002: cscott: Backport for Ensure that user interface language is "used" by postprocessing pipeline (T413227) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 17:36 cscott@deploy2002: Started scap sync-world: Backport for Ensure that user interface language is "used" by postprocessing pipeline (T413227)
  • 17:21 mforns@deploy2002: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
  • 17:21 mforns@deploy2002: helmfile [staging] START helmfile.d/services/edit-analytics: apply
  • 17:09 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
  • 17:09 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
  • 16:44 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wdqs1029.eqiad.wmnet with OS bookworm
  • 16:18 ejegg: civicrm upgraded from 878d168c to 4eee8c62
  • 16:09 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
  • 16:09 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
  • 15:45 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1029.eqiad.wmnet with reason: host reimage
  • 15:41 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1029.eqiad.wmnet with reason: host reimage
  • 15:21 elukey: restored the correct puppetserver1001's TLS certificate for puppet following https://phabricator.wikimedia.org/T405580#11214327
  • 15:20 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1029.eqiad.wmnet with OS bookworm
  • 15:07 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wdqs1029.eqiad.wmnet with OS bookworm
  • 15:06 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1029.eqiad.wmnet with OS bookworm
  • 13:50 mforns@deploy2002: helmfile [codfw] DONE helmfile.d/services/page-analytics: apply
  • 13:50 mforns@deploy2002: helmfile [codfw] START helmfile.d/services/page-analytics: apply
  • 13:50 mforns@deploy2002: helmfile [eqiad] DONE helmfile.d/services/page-analytics: apply
  • 13:50 mforns@deploy2002: helmfile [eqiad] START helmfile.d/services/page-analytics: apply
  • 13:48 mforns@deploy2002: helmfile [staging] DONE helmfile.d/services/page-analytics: apply
  • 13:47 mforns@deploy2002: helmfile [staging] START helmfile.d/services/page-analytics: apply
  • 13:41 jgleeson: payments-wiki upgraded from 14e22620 to 4d41d604
  • 13:39 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es2028.codfw.wmnet with OS trixie
  • 13:12 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 13:11 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 13:10 btullis@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
  • 13:10 btullis@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
  • 12:50 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
  • 12:50 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
  • 11:11 mforns@deploy2002: helmfile [staging] DONE helmfile.d/services/page-analytics: apply
  • 11:11 mforns@deploy2002: helmfile [staging] START helmfile.d/services/page-analytics: apply
  • 10:25 elukey@deploy2002: Finished deploy [docker-pkg/deploy@b6cc5ab]: (no justification provided) (duration: 00m 12s)
  • 10:25 elukey@deploy2002: Started deploy [docker-pkg/deploy@b6cc5ab]: (no justification provided)
  • 10:12 jelto@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 10:12 jelto@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 10:12 jelto@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 10:11 jelto@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 10:11 stran@deploy2002: Finished scap sync-world: Backport for Only show temp accounts on IP if temp accounts are known (T413139) (duration: 07m 37s)
  • 10:07 stran@deploy2002: mszwarc, stran: Continuing with sync
  • 10:05 stran@deploy2002: mszwarc, stran: Backport for Only show temp accounts on IP if temp accounts are known (T413139) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 10:03 stran@deploy2002: Started scap sync-world: Backport for Only show temp accounts on IP if temp accounts are known (T413139)
  • 10:00 elukey@deploy2002: Finished deploy [docker-pkg/deploy@1769f71]: (no justification provided) (duration: 00m 44s)
  • 09:59 elukey@deploy2002: Started deploy [docker-pkg/deploy@1769f71]: (no justification provided)
  • 09:56 moritzm: installing Linux 5.10.247 on Bullseye hosts
  • 09:21 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 09:21 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 08:50 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 138881
  • 08:50 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 138881
  • 08:44 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 8560
  • 08:41 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 8560
  • 07:55 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2028.codfw.wmnet with OS trixie
  • 05:16 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2218.codfw.wmnet with reason: Maintenance
  • 02:53 ejegg: payments-wiki upgraded from 8a207d81 to 14e22620
  • 01:13 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 13m 04s)
  • 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image

2025-12-18

  • 22:24 logmsgbot: mstyles Deployed security patch for T384147
  • 22:15 jhathaway: uploading corto 1.0.21
  • 22:11 cwhite@deploy2002: Finished deploy [statsv/statsv@0751b0b]: T383563 (duration: 00m 10s)
  • 22:11 cwhite@deploy2002: Started deploy [statsv/statsv@0751b0b]: T383563
  • 21:51 swfrench@cumin2002: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Deploy: Various UI improvements - swfrench@cumin2002"
  • 21:51 swfrench@cumin2002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Deploy: Various UI improvements - swfrench@cumin2002
  • 21:50 swfrench@cumin2002: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Deploy: Various UI improvements - swfrench@cumin2002
  • 21:50 swfrench@cumin2002: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Deploy: Various UI improvements - swfrench@cumin2002"
  • 21:47 eileen: civicrm upgraded from 0560cfd9 to 878d168c
  • 21:24 toyofuku@deploy2002: Finished scap sync-world: Backport for [Legal Footer] Deploy Legal Footer for Phase 1 wikis (T412455) (duration: 07m 04s)
  • 21:19 toyofuku@deploy2002: toyofuku, lmora: Continuing with sync
  • 21:19 toyofuku@deploy2002: toyofuku, lmora: Backport for [Legal Footer] Deploy Legal Footer for Phase 1 wikis (T412455) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 21:16 toyofuku@deploy2002: Started scap sync-world: Backport for [Legal Footer] Deploy Legal Footer for Phase 1 wikis (T412455)
  • 21:13 tgr@deploy2002: Finished scap sync-world: Backport for Remove LoggedOut cookie logic (T142542), Turn on Parsoid Read Views on itwiki (T413084), Logos: Handle missing responsive URLs, manually modify thumbnail sizes to avoid $wgThumbnailSteps (T405169) (duration: 06m 28s)
  • 21:09 tgr@deploy2002: pppery, tgr, cscott: Continuing with sync
  • 21:08 tgr@deploy2002: pppery, tgr, cscott: Backport for Remove LoggedOut cookie logic (T142542), Turn on Parsoid Read Views on itwiki (T413084), Logos: Handle missing responsive URLs, manually modify thumbnail sizes to avoid $wgThumbnailSteps (T405169) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 21:06 tgr@deploy2002: Started scap sync-world: Backport for Remove LoggedOut cookie logic (T142542), Turn on Parsoid Read Views on itwiki (T413084), Logos: Handle missing responsive URLs, manually modify thumbnail sizes to avoid $wgThumbnailSteps (T405169)
  • 20:49 dancy@deploy2002: Installation of scap version "4.230.0" completed for 1 hosts
  • 20:48 dancy@deploy2002: Installing scap version "4.230.0" for 1 host(s)
  • 20:47 dancy@deploy2002: Installation of scap version "4.230.0" completed for 2 hosts
  • 20:45 dancy@deploy2002: Installing scap version "4.230.0" for 2 host(s)
  • 19:10 dancy@deploy2002: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.7 refs T408277
  • 18:46 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 18:46 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 18:46 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 18:45 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 18:45 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 18:45 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 18:42 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 18:42 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 18:42 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 18:42 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 18:42 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 18:42 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 18:41 ejegg: donorwiki upgraded from 99671dda to 14e22620
  • 17:26 dreamyjazz@deploy2002: Finished scap sync-world: Backport for CheckUser: Set $wgCheckUserLogMaxRangeToShowInLog (T320769) (duration: 06m 46s)
  • 17:22 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
  • 17:22 dreamyjazz@deploy2002: dreamyjazz: Backport for CheckUser: Set $wgCheckUserLogMaxRangeToShowInLog (T320769) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 17:19 dreamyjazz@deploy2002: Started scap sync-world: Backport for CheckUser: Set $wgCheckUserLogMaxRangeToShowInLog (T320769)
  • 16:43 elukey@deploy2002: Finished deploy [docker-pkg/deploy@a8e9cb3]: (no justification provided) (duration: 00m 15s)
  • 16:42 elukey@deploy2002: Started deploy [docker-pkg/deploy@a8e9cb3]: (no justification provided)
  • 16:27 elukey@deploy2002: Finished deploy [docker-pkg/deploy@a8e9cb3]: (no justification provided) (duration: 00m 12s)
  • 16:27 elukey@deploy2002: Started deploy [docker-pkg/deploy@a8e9cb3]: (no justification provided)
  • 15:05 Lucas_WMDE: UTC afternoon backport+config window done
  • 15:03 cscott@deploy2002: Finished scap sync-world: Backport for Turn on Parsoid Read Views on nlwiki (T413084) (duration: 09m 12s)
  • 14:58 cscott@deploy2002: cscott: Continuing with sync
  • 14:56 cscott@deploy2002: cscott: Backport for Turn on Parsoid Read Views on nlwiki (T413084) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:54 cscott@deploy2002: Started scap sync-world: Backport for Turn on Parsoid Read Views on nlwiki (T413084)
  • 14:50 sgimeno@deploy2002: Finished scap sync-world: Backport for UserImpact: stop using pre-computed impact in the user impact job (T398500) (duration: 09m 31s)
  • 14:46 sgimeno@deploy2002: sgimeno: Continuing with sync
  • 14:43 sgimeno@deploy2002: sgimeno: Backport for UserImpact: stop using pre-computed impact in the user impact job (T398500) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:41 sgimeno@deploy2002: Started scap sync-world: Backport for UserImpact: stop using pre-computed impact in the user impact job (T398500)
  • 14:28 moritzm: installing rubygems security updates
  • 14:25 derick@deploy2002: Finished scap sync-world: Backport for Rest: Add more debug logging for `Resource::getProfile()` (T409901), Rest: Add more debug logging for `Resource::getProfile()` (T409901) (duration: 06m 54s)
  • 14:21 derick@deploy2002: d3r1ck01, derick: Continuing with sync
  • 14:20 derick@deploy2002: d3r1ck01, derick: Backport for Rest: Add more debug logging for `Resource::getProfile()` (T409901), Rest: Add more debug logging for `Resource::getProfile()` (T409901) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:18 derick@deploy2002: Started scap sync-world: Backport for Rest: Add more debug logging for `Resource::getProfile()` (T409901), Rest: Add more debug logging for `Resource::getProfile()` (T409901)
  • 14:14 stran@deploy2002: Finished scap sync-world: Backport for Revert^2 "Enable v2 non-emergency workflow by default" (duration: 08m 50s)
  • 14:10 stran@deploy2002: stran: Continuing with sync
  • 14:08 stran@deploy2002: stran: Backport for Revert^2 "Enable v2 non-emergency workflow by default" synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:05 stran@deploy2002: Started scap sync-world: Backport for Revert^2 "Enable v2 non-emergency workflow by default"
  • 14:02 dbrant@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
  • 14:01 dbrant@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
  • 14:01 dbrant@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
  • 14:00 dbrant@deploy2002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
  • 14:00 dbrant@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
  • 13:59 dbrant@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
  • 13:43 cmooney@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es2028.codfw.wmnet with OS trixie
  • 13:14 dbrant@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifeeds: apply
  • 13:14 dbrant@deploy2002: helmfile [codfw] START helmfile.d/services/wikifeeds: apply
  • 13:14 dbrant@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifeeds: apply
  • 13:13 dbrant@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifeeds: apply
  • 13:13 dbrant@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
  • 13:12 dbrant@deploy2002: helmfile [staging] START helmfile.d/services/wikifeeds: apply
  • 12:53 root@cumin2002: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Joely Rooke WMDE out of all services on: 2435 hosts
  • 12:41 cmooney@cumin1003: START - Cookbook sre.hosts.reimage for host es2028.codfw.wmnet with OS trixie
  • 11:22 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
  • 11:21 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-cron: apply
  • 09:36 Emperor: restart swift-container-sync on ms-be2081 T413008
  • 08:56 ammarpad@deploy2002: mwscript-k8s job started: extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=enwikisource --logwiki=metawiki 'Anurag Bhattamishra' 'Renamed user d198c4f693b15534f61d97349d9d7d8e' # T413036
  • 07:55 moritzm: bounced slapd on serpens after cleaning up a failed logrotate
  • 05:45 musikanimal@deploy2002: Finished scap sync-world: Backport for codemirror.less: order the gutters (T412884), CodeMirror: disable spellcheck for non-wikitext (T412848), extension.json: make activeLine on by default for non-wikitext (T412886), CodeMirrorJavaScript: better descriptions for ESLint suggestions (duration: 12m 04s)
  • 05:41 musikanimal@deploy2002: musikanimal: Continuing with sync
  • 05:35 musikanimal@deploy2002: musikanimal: Backport for codemirror.less: order the gutters (T412884), CodeMirror: disable spellcheck for non-wikitext (T412848), extension.json: make activeLine on by default for non-wikitext (T412886), CodeMirrorJavaScript: better descriptions for ESLint suggestions synced to the testservers (see https://wikitech.wik
  • 05:33 musikanimal@deploy2002: Started scap sync-world: Backport for codemirror.less: order the gutters (T412884), CodeMirror: disable spellcheck for non-wikitext (T412848), extension.json: make activeLine on by default for non-wikitext (T412886), CodeMirrorJavaScript: better descriptions for ESLint suggestions
  • 04:03 eileen: civicrm upgraded from 12b8fa9d to 0560cfd9
  • 02:54 musikanimal@deploy2002: Finished scap sync-world: Backport for Use CodeMirror instead of CodeEditor for beta feature users + vue mode (T373711) (duration: 07m 15s)
  • 02:50 musikanimal@deploy2002: musikanimal: Continuing with sync
  • 02:49 musikanimal@deploy2002: musikanimal: Backport for Use CodeMirror instead of CodeEditor for beta feature users + vue mode (T373711) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 02:47 musikanimal@deploy2002: Started scap sync-world: Backport for Use CodeMirror instead of CodeEditor for beta feature users + vue mode (T373711)
  • 01:24 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 23m 23s)
  • 01:01 rzl@deploy2002: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
  • 01:01 rzl@deploy2002: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
  • 01:01 rzl@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
  • 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
  • 01:00 rzl@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
  • 00:56 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/zotero: apply
  • 00:56 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/zotero: apply
  • 00:56 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 00:55 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 00:55 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifeeds: apply
  • 00:54 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/wikifeeds: apply
  • 00:54 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikidata-query-gui: apply
  • 00:54 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/wikidata-query-gui: apply
  • 00:54 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/toolhub: apply
  • 00:53 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/toolhub: apply
  • 00:53 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/termbox: apply
  • 00:52 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/termbox: apply
  • 00:50 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: apply
  • 00:50 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: apply
  • 00:49 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/sessionstore: apply
  • 00:49 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/sessionstore: apply
  • 00:48 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/recommendation-api: apply
  • 00:48 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/recommendation-api: apply
  • 00:48 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 00:48 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/rdf-streaming-updater: apply
  • 00:47 cstone: civicrm upgraded from 28ef5eb1 to 12b8fa9d
  • 00:47 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/push-notifications: apply
  • 00:46 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/push-notifications: apply
  • 00:46 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/proton: apply
  • 00:45 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/proton: apply
  • 00:44 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/page-analytics: apply
  • 00:44 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/page-analytics: apply
  • 00:44 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
  • 00:44 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
  • 00:43 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
  • 00:43 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
  • 00:42 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 00:40 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 00:37 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
  • 00:37 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/media-analytics: apply
  • 00:37 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/machinetranslation: apply
  • 00:19 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/machinetranslation: apply
  • 00:19 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
  • 00:18 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/ipoid: apply
  • 00:18 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/image-suggestion: apply
  • 00:18 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/image-suggestion: apply
  • 00:17 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/geo-analytics: apply
  • 00:17 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/geo-analytics: apply
  • 00:17 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventstreams-internal: apply
  • 00:15 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/eventstreams-internal: apply
  • 00:15 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
  • 00:14 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/eventstreams: apply
  • 00:14 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-main: apply
  • 00:13 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-main: apply
  • 00:13 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-logging-external: apply
  • 00:13 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-logging-external: apply
  • 00:12 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply
  • 00:12 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply
  • 00:12 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics: apply
  • 00:11 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-analytics: apply
  • 00:10 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/editor-analytics: apply
  • 00:10 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/editor-analytics: apply
  • 00:10 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/edit-analytics: apply
  • 00:10 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/edit-analytics: apply
  • 00:10 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/echostore: apply
  • 00:09 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/echostore: apply
  • 00:08 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/device-analytics: apply
  • 00:08 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/device-analytics: apply
  • 00:08 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
  • 00:08 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/developer-portal: apply
  • 00:07 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/data-gateway: apply
  • 00:07 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/data-gateway: apply
  • 00:07 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/commons-impact-analytics: apply
  • 00:06 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/commons-impact-analytics: apply
  • 00:05 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:05 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:05 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/chart-renderer: apply
  • 00:04 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/chart-renderer: apply
  • 00:04 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/apertium: apply
  • 00:03 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/apertium: apply

2025-12-17

  • 23:52 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/zotero: apply
  • 23:52 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/zotero: apply
  • 23:52 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 23:51 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 23:51 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifeeds: apply
  • 23:50 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifeeds: apply
  • 23:49 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikidata-query-gui: apply
  • 23:49 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/wikidata-query-gui: apply
  • 23:48 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/toolhub: apply
  • 23:48 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/toolhub: apply
  • 23:47 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
  • 23:47 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/thumbor: apply
  • 23:46 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/termbox: apply
  • 23:45 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/termbox: apply
  • 23:45 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: apply
  • 23:45 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: apply
  • 23:44 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/sessionstore: apply
  • 23:44 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/sessionstore: apply
  • 23:43 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/recommendation-api: apply
  • 23:43 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/recommendation-api: apply
  • 23:43 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 23:43 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/rdf-streaming-updater: apply
  • 23:42 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/push-notifications: apply
  • 23:42 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/push-notifications: apply
  • 23:42 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/proton: apply
  • 23:41 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/proton: apply
  • 23:36 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/page-analytics: apply
  • 23:35 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/page-analytics: apply
  • 23:35 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-page-content-change-enrich: apply
  • 23:35 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-page-content-change-enrich: apply
  • 23:34 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
  • 23:34 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
  • 23:33 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 23:31 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 23:31 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
  • 23:31 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
  • 23:30 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/machinetranslation: apply
  • 23:18 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/machinetranslation: apply
  • 23:14 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply
  • 23:13 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply
  • 23:13 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
  • 23:13 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/ipoid: apply
  • 23:12 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/image-suggestion: apply
  • 23:12 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/image-suggestion: apply
  • 23:12 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/geo-analytics: apply
  • 23:11 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/geo-analytics: apply
  • 23:11 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventstreams-internal: apply
  • 23:10 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/eventstreams-internal: apply
  • 23:10 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
  • 23:10 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
  • 23:09 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: apply
  • 23:09 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-main: apply
  • 23:09 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-logging-external: apply
  • 23:08 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-logging-external: apply
  • 23:08 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics-external: apply
  • 23:08 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-analytics-external: apply
  • 23:08 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics: apply
  • 23:07 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-analytics: apply
  • 23:07 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/editor-analytics: apply
  • 23:06 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/editor-analytics: apply
  • 23:06 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/edit-analytics: apply
  • 23:06 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/edit-analytics: apply
  • 23:05 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/echostore: apply
  • 23:04 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/echostore: apply
  • 23:04 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/device-analytics: apply
  • 23:04 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/device-analytics: apply
  • 23:04 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
  • 23:03 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
  • 23:03 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/data-gateway: apply
  • 23:03 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/data-gateway: apply
  • 23:02 eileen: config revision changed from 7d6ad875 to e478c565
  • 22:59 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/commons-impact-analytics: apply
  • 22:59 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/commons-impact-analytics: apply
  • 22:58 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:58 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:57 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/chart-renderer: apply
  • 22:56 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/chart-renderer: apply
  • 22:55 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/apertium: apply
  • 22:55 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/apertium: apply
  • 22:53 jhathaway: upload new version of corto
  • 22:48 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
  • 22:47 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
  • 22:47 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
  • 22:46 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
  • 22:46 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
  • 22:45 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
  • 22:45 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
  • 22:45 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
  • 22:43 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
  • 22:43 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
  • 22:41 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
  • 22:31 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
  • 22:30 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
  • 22:30 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox: apply
  • 22:17 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
  • 22:17 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
  • 22:16 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
  • 22:16 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
  • 22:15 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
  • 22:15 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
  • 22:14 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
  • 22:14 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
  • 22:14 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
  • 22:13 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
  • 22:12 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
  • 22:12 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox: apply
  • 21:56 egardner@deploy2002: Finished scap sync-world: Backport for Delay StickyHeaders section click instrumentation for slow loads (T412857) (duration: 07m 47s)
  • 21:52 egardner@deploy2002: egardner: Continuing with sync
  • 21:50 egardner@deploy2002: egardner: Backport for Delay StickyHeaders section click instrumentation for slow loads (T412857) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 21:48 egardner@deploy2002: Started scap sync-world: Backport for Delay StickyHeaders section click instrumentation for slow loads (T412857)
  • 21:36 cscott@deploy2002: Finished scap sync-world: Backport for Enable post-processing cache for all Parsoid-rendered wikis (T348255), Decommission Article Summaries (T411558) (duration: 12m 13s)
  • 21:32 cscott@deploy2002: ksarabia, ihurbain, cscott: Continuing with sync
  • 21:26 cscott@deploy2002: ksarabia, ihurbain, cscott: Backport for Enable post-processing cache for all Parsoid-rendered wikis (T348255), Decommission Article Summaries (T411558) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 21:23 cscott@deploy2002: Started scap sync-world: Backport for Enable post-processing cache for all Parsoid-rendered wikis (T348255), Decommission Article Summaries (T411558)
  • 21:18 cscott@deploy2002: Finished scap sync-world: Backport for ParserOutputAccess: don't use PoolCounter recursively (T412959), ParserOutputAccess: don't use PoolCounter recursively (T412959) (duration: 08m 50s)
  • 21:15 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2239.codfw.wmnet with reason: Maintenance
  • 21:15 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 (T410589)', diff saved to https://phabricator.wikimedia.org/P86728 and previous config saved to /var/cache/conftool/dbconfig/20251217-211537-ladsgroup.json
  • 21:14 cscott@deploy2002: cscott: Continuing with sync
  • 21:11 cscott@deploy2002: cscott: Backport for ParserOutputAccess: don't use PoolCounter recursively (T412959), ParserOutputAccess: don't use PoolCounter recursively (T412959) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 21:09 cscott@deploy2002: Started scap sync-world: Backport for ParserOutputAccess: don't use PoolCounter recursively (T412959), ParserOutputAccess: don't use PoolCounter recursively (T412959)
  • 21:00 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P86727 and previous config saved to /var/cache/conftool/dbconfig/20251217-210029-ladsgroup.json
  • 20:52 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
  • 20:52 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-video: apply
  • 20:51 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
  • 20:51 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
  • 20:51 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-videoscaler: apply
  • 20:50 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mw-videoscaler: apply
  • 20:50 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
  • 20:50 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
  • 20:49 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: apply
  • 20:49 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
  • 20:49 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: apply
  • 20:49 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-media: apply
  • 20:49 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
  • 20:48 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
  • 20:48 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox: apply
  • 20:47 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox: apply
  • 20:45 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P86726 and previous config saved to /var/cache/conftool/dbconfig/20251217-204520-ladsgroup.json
  • 20:30 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 (T410589)', diff saved to https://phabricator.wikimedia.org/P86725 and previous config saved to /var/cache/conftool/dbconfig/20251217-203012-ladsgroup.json
  • 19:11 dancy@deploy2002: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.7 refs T408277
  • 18:55 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1236.eqiad.wmnet with reason: Maintenance
  • 18:48 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
  • 18:48 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-test: apply
  • 18:47 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
  • 18:46 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-test: apply
  • 18:42 swfrench@deploy2002: Finished scap sync-world: Rebuild deployment to pick up new production image (duration: 78m 01s)
  • 18:32 cmooney@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host es2028.codfw.wmnet with OS trixie
  • 17:54 cmooney@cumin1003: START - Cookbook sre.hosts.reimage for host es2028.codfw.wmnet with OS trixie
  • 17:51 topranks: upgrading OS on lswtest-d8-eqiad T412733
  • 17:51 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ssw1-d[1,8]-eqiad with reason: upgrading sr-linux on lswtest-d8-eqiad
  • 17:50 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ssw1-d[1,8]-eqiad.mgmt with reason: upgrading sr-linux on lswtest-d8-eqiad
  • 17:46 cmooney@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host es2028.codfw.wmnet with OS trixie
  • 17:34 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest1006.eqiad.wmnet with reason: upgrading connected switch
  • 17:33 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lswtest-d8-eqiad,lswtest-d8-eqiad IPv6 with reason: upgrading sr-linux on lswtest-d8-eqiad
  • 17:28 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host es2028
  • 17:28 cmooney@cumin1003: START - Cookbook sre.hosts.move-vlan for host es2028
  • 17:28 cmooney@cumin1003: START - Cookbook sre.hosts.reimage for host es2028.codfw.wmnet with OS trixie
  • 17:27 swfrench@deploy2002: Started scap sync-world: Rebuild deployment to pick up new production image
  • 17:24 cmooney@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es2028.codfw.wmnet with OS trixie
  • 17:12 swfrench-wmf: reprepro include php8.3_8.3.28-1+wmf11u2 in component/php83
  • 17:08 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp7009.*
  • 17:04 fabfur: enabling puppet and repooling cp7009 (T412785)
  • 16:38 eevans@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase1031.eqiad.wmnet
  • 16:31 eevans@cumin1003: START - Cookbook sre.hosts.reboot-single for host restbase1031.eqiad.wmnet
  • 15:50 eevans@cumin1003: conftool action : set/pooled=yes; selector: dc=eqiad,cluster=restbase,service=restbase-ssl
  • 15:50 eevans@cumin1003: conftool action : set/pooled=yes; selector: dc=eqiad,cluster=restbase,service=restbase-https
  • 15:49 eevans@cumin1003: conftool action : set/pooled=yes; selector: dc=eqiad,cluster=restbase,service=restbase-backend
  • 15:45 eevans@cumin1003: conftool action : set/pooled=yes; selector: dc=eqiad,cluster=restbase,service=restbase-*
  • 15:28 moritzm: upgrade Envoy on etherpad* T410975
  • 15:12 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) restbase1031.eqiad.wmnet on all recursors
  • 15:12 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache restbase1031.eqiad.wmnet on all recursors
  • 15:12 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:12 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add AAAA to restbase1031 - ayounsi@cumin1003"
  • 15:11 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add AAAA to restbase1031 - ayounsi@cumin1003"
  • 15:11 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host es2028
  • 15:11 cmooney@cumin1003: START - Cookbook sre.hosts.move-vlan for host es2028
  • 15:11 cmooney@cumin1003: START - Cookbook sre.hosts.reimage for host es2028.codfw.wmnet with OS trixie
  • 15:07 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
  • 15:06 XioNoX: add AAAA record to restbase1031.eqiad.wmnet - T271140
  • 15:05 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
  • 15:05 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/api-gateway: apply
  • 15:04 Lucas_WMDE: UTC afternoon backport+config window done
  • 15:03 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
  • 15:03 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
  • 15:01 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
  • 15:01 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/api-gateway: apply
  • 14:59 moritzm: installing nodejs security updates
  • 14:53 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
  • 14:53 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
  • 14:51 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Revert "Enable v2 non-emergency workflow by default" (T410512 T412715), Activate post-processing cache on some wikis (T348255) (duration: 18m 45s)
  • 14:50 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
  • 14:50 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
  • 14:47 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, ihurbain: Continuing with sync
  • 14:41 moritzm: installing tiff security updates
  • 14:35 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, ihurbain: Backport for Revert "Enable v2 non-emergency workflow by default" (T410512 T412715), Activate post-processing cache on some wikis (T348255) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:33 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Revert "Enable v2 non-emergency workflow by default" (T410512 T412715), Activate post-processing cache on some wikis (T348255)
  • 14:29 lucaswerkmeister-wmde@deploy2002: Sync cancelled.
  • 14:22 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
  • 14:22 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
  • 14:19 lucaswerkmeister-wmde@deploy2002: stran, lucaswerkmeister-wmde: Backport for Enable v2 non-emergency workflow by default (T410512 T412715) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:18 moritzm: installing redis security updates
  • 14:16 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Enable v2 non-emergency workflow by default (T410512 T412715)
  • 14:11 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for lift throttle limits for Sing Lit 2025 (T412820) (duration: 07m 10s)
  • 14:09 moritzm: installing pdns-recursor security updates
  • 14:07 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, robertsky: Continuing with sync
  • 14:06 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, robertsky: Backport for lift throttle limits for Sing Lit 2025 (T412820) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:04 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for lift throttle limits for Sing Lit 2025 (T412820)
  • 13:53 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
  • 13:52 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
  • 13:47 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
  • 13:45 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
  • 13:44 moritzm: upgrade Envoy on grafana* T410975
  • 13:36 moritzm: installing apache2 security updates
  • 13:32 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
  • 13:29 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
  • 13:27 moritzm: upgrade Envoy on an-web T410975
  • 12:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86724 and previous config saved to /var/cache/conftool/dbconfig/20251217-121556-marostegui.json
  • 12:15 moritzm: installing pam security updates
  • 12:07 mvolz@deploy2002: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
  • 12:06 mvolz@deploy2002: helmfile [eqiad] START helmfile.d/services/citoid: apply
  • 12:04 mvolz@deploy2002: helmfile [codfw] DONE helmfile.d/services/citoid: apply
  • 12:04 mvolz@deploy2002: helmfile [codfw] START helmfile.d/services/citoid: apply
  • 12:02 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/citoid: apply
  • 12:01 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/citoid: apply
  • 12:00 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P86723 and previous config saved to /var/cache/conftool/dbconfig/20251217-120047-marostegui.json
  • 11:45 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P86722 and previous config saved to /var/cache/conftool/dbconfig/20251217-114539-marostegui.json
  • 11:42 moritzm: installing libsndfile security updates
  • 11:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86721 and previous config saved to /var/cache/conftool/dbconfig/20251217-113031-marostegui.json
  • 11:28 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2222 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86720 and previous config saved to /var/cache/conftool/dbconfig/20251217-112818-marostegui.json
  • 11:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2222.codfw.wmnet with reason: Maintenance
  • 11:28 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86719 and previous config saved to /var/cache/conftool/dbconfig/20251217-112805-marostegui.json
  • 11:23 Amir1: dropped "trash" and "percona" databases in x1
  • 11:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P86718 and previous config saved to /var/cache/conftool/dbconfig/20251217-111257-marostegui.json
  • 11:04 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wdqs1029.eqiad.wmnet with OS trixie
  • 10:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P86717 and previous config saved to /var/cache/conftool/dbconfig/20251217-105748-marostegui.json
  • 10:51 moritzm: installing libssh security updates
  • 10:42 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86716 and previous config saved to /var/cache/conftool/dbconfig/20251217-104240-marostegui.json
  • 10:37 jmm@deploy2002: helmfile [eqiad] DONE helmfile.d/services/kartotherian: apply
  • 10:36 jmm@deploy2002: helmfile [eqiad] START helmfile.d/services/kartotherian: apply
  • 10:35 jmm@deploy2002: helmfile [codfw] DONE helmfile.d/services/kartotherian: apply
  • 10:34 jmm@deploy2002: helmfile [codfw] START helmfile.d/services/kartotherian: apply
  • 10:34 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1029.eqiad.wmnet with reason: host reimage
  • 10:33 jmm@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: apply
  • 10:33 jmm@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: apply
  • 10:26 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1029.eqiad.wmnet with reason: host reimage
  • 10:08 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1029.eqiad.wmnet with OS trixie
  • 10:07 kart_: Updated cxserver to 2025-12-15-140202-production
  • 09:59 kartik@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
  • 09:59 kartik@deploy2002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
  • 09:55 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
  • 09:54 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/cxserver: apply
  • 09:32 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp7009.*
  • 09:32 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp7009.*
  • 09:28 fabfur: depool and disable puppet on cp7009 for haproxy qos testing (T412785)
  • 09:18 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
  • 09:18 kartik@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
  • 09:14 jelto@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 09:13 moritzm: installing nginx security updates
  • 09:13 jelto@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 09:12 jelto@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 09:09 jelto@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 09:07 jelto@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 09:06 jelto@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 09:05 jelto@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 09:04 jelto@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 08:41 elukey@deploy2002: Finished deploy [docker-pkg/deploy@4533f76]: Deploy docker-pkg (duration: 01m 08s)
  • 08:40 elukey@deploy2002: Started deploy [docker-pkg/deploy@4533f76]: Deploy docker-pkg
  • 08:26 moritzm: installing jq security updates
  • 08:13 akosiaris@deploy2002: Finished scap sync-world: Backport for Update fc-list to point to fc-list Tool (T280718) (duration: 08m 22s)
  • 08:08 akosiaris@deploy2002: akosiaris: Continuing with sync
  • 08:07 akosiaris@deploy2002: akosiaris: Backport for Update fc-list to point to fc-list Tool (T280718) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 08:04 akosiaris@deploy2002: Started scap sync-world: Backport for Update fc-list to point to fc-list Tool (T280718)
  • 06:07 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2221 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86714 and previous config saved to /var/cache/conftool/dbconfig/20251217-060706-marostegui.json
  • 06:06 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2221.codfw.wmnet with reason: Maintenance
  • 06:06 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2220 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86713 and previous config saved to /var/cache/conftool/dbconfig/20251217-060641-marostegui.json
  • 05:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2220', diff saved to https://phabricator.wikimedia.org/P86712 and previous config saved to /var/cache/conftool/dbconfig/20251217-055133-marostegui.json
  • 05:42 eileen: civicrm upgraded from a0d1f1f7 to 28ef5eb1
  • 05:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2220', diff saved to https://phabricator.wikimedia.org/P86711 and previous config saved to /var/cache/conftool/dbconfig/20251217-053625-marostegui.json
  • 05:30 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2222.codfw.wmnet with reason: schema change
  • 05:21 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2220 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86710 and previous config saved to /var/cache/conftool/dbconfig/20251217-052117-marostegui.json
  • 05:15 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 05:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86709 and previous config saved to /var/cache/conftool/dbconfig/20251217-051509-marostegui.json
  • 05:00 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253', diff saved to https://phabricator.wikimedia.org/P86708 and previous config saved to /var/cache/conftool/dbconfig/20251217-050001-marostegui.json
  • 04:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253', diff saved to https://phabricator.wikimedia.org/P86707 and previous config saved to /var/cache/conftool/dbconfig/20251217-044453-marostegui.json
  • 04:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86706 and previous config saved to /var/cache/conftool/dbconfig/20251217-042943-marostegui.json
  • 04:27 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1253 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86705 and previous config saved to /var/cache/conftool/dbconfig/20251217-042733-marostegui.json
  • 04:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1253.eqiad.wmnet with reason: Maintenance
  • 04:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86704 and previous config saved to /var/cache/conftool/dbconfig/20251217-042708-marostegui.json
  • 04:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P86703 and previous config saved to /var/cache/conftool/dbconfig/20251217-041200-marostegui.json
  • 03:56 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P86702 and previous config saved to /var/cache/conftool/dbconfig/20251217-035651-marostegui.json
  • 03:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86701 and previous config saved to /var/cache/conftool/dbconfig/20251217-034143-marostegui.json
  • 02:59 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2227 (T410589)', diff saved to https://phabricator.wikimedia.org/P86700 and previous config saved to /var/cache/conftool/dbconfig/20251217-025900-ladsgroup.json
  • 02:58 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2227.codfw.wmnet with reason: Maintenance
  • 02:58 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 (T410589)', diff saved to https://phabricator.wikimedia.org/P86699 and previous config saved to /var/cache/conftool/dbconfig/20251217-025835-ladsgroup.json
  • 02:43 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P86698 and previous config saved to /var/cache/conftool/dbconfig/20251217-024326-ladsgroup.json
  • 02:41 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1231 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86697 and previous config saved to /var/cache/conftool/dbconfig/20251217-024127-marostegui.json
  • 02:41 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1231.eqiad.wmnet with reason: Maintenance
  • 02:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86696 and previous config saved to /var/cache/conftool/dbconfig/20251217-024103-marostegui.json
  • 02:28 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P86695 and previous config saved to /var/cache/conftool/dbconfig/20251217-022818-ladsgroup.json
  • 02:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P86694 and previous config saved to /var/cache/conftool/dbconfig/20251217-022554-marostegui.json
  • 02:13 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 (T410589)', diff saved to https://phabricator.wikimedia.org/P86693 and previous config saved to /var/cache/conftool/dbconfig/20251217-021310-ladsgroup.json
  • 02:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P86692 and previous config saved to /var/cache/conftool/dbconfig/20251217-021046-marostegui.json
  • 01:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86691 and previous config saved to /var/cache/conftool/dbconfig/20251217-015538-marostegui.json
  • 01:25 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 24m 10s)
  • 01:01 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
  • 00:58 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/zotero: apply
  • 00:58 rzl@deploy2002: helmfile [staging] START helmfile.d/services/zotero: apply
  • 00:58 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 00:57 rzl@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 00:57 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
  • 00:57 rzl@deploy2002: helmfile [staging] START helmfile.d/services/wikifeeds: apply
  • 00:56 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/wikidata-query-gui: apply
  • 00:56 rzl@deploy2002: helmfile [staging] START helmfile.d/services/wikidata-query-gui: apply
  • 00:56 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/toolhub: apply
  • 00:56 rzl@deploy2002: helmfile [staging] START helmfile.d/services/toolhub: apply
  • 00:50 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/termbox: apply
  • 00:50 rzl@deploy2002: helmfile [staging] START helmfile.d/services/termbox: apply
  • 00:50 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: apply
  • 00:49 rzl@deploy2002: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: apply
  • 00:49 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
  • 00:49 rzl@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-video: apply
  • 00:49 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
  • 00:48 rzl@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
  • 00:48 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
  • 00:48 rzl@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
  • 00:47 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2220 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86690 and previous config saved to /var/cache/conftool/dbconfig/20251217-004659-marostegui.json
  • 00:46 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
  • 00:46 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2220.codfw.wmnet with reason: Maintenance
  • 00:46 rzl@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-media: apply
  • 00:46 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
  • 00:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86689 and previous config saved to /var/cache/conftool/dbconfig/20251217-004634-marostegui.json
  • 00:46 rzl@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
  • 00:46 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox: apply
  • 00:45 rzl@deploy2002: helmfile [staging] START helmfile.d/services/shellbox: apply
  • 00:45 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/sessionstore: apply
  • 00:45 rzl@deploy2002: helmfile [staging] START helmfile.d/services/sessionstore: apply
  • 00:43 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/recommendation-api: apply
  • 00:43 rzl@deploy2002: helmfile [staging] START helmfile.d/services/recommendation-api: apply
  • 00:43 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 00:43 rzl@deploy2002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 00:43 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/push-notifications: apply
  • 00:43 rzl@deploy2002: helmfile [staging] START helmfile.d/services/push-notifications: apply
  • 00:43 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/proton: apply
  • 00:42 rzl@deploy2002: helmfile [staging] START helmfile.d/services/proton: apply
  • 00:42 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/page-analytics: apply
  • 00:42 rzl@deploy2002: helmfile [staging] START helmfile.d/services/page-analytics: apply
  • 00:41 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply
  • 00:41 rzl@deploy2002: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
  • 00:39 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
  • 00:39 rzl@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
  • 00:39 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 00:38 rzl@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 00:38 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
  • 00:37 rzl@deploy2002: helmfile [staging] START helmfile.d/services/media-analytics: apply
  • 00:37 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/machinetranslation: apply
  • 00:34 rzl@deploy2002: helmfile [staging] START helmfile.d/services/machinetranslation: apply
  • 00:33 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply
  • 00:32 rzl@deploy2002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
  • 00:31 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: apply
  • 00:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P86688 and previous config saved to /var/cache/conftool/dbconfig/20251217-003126-marostegui.json
  • 00:30 rzl@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: apply
  • 00:30 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
  • 00:30 rzl@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
  • 00:29 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/image-suggestion: apply
  • 00:29 rzl@deploy2002: helmfile [staging] START helmfile.d/services/image-suggestion: apply
  • 00:29 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/geo-analytics: apply
  • 00:28 rzl@deploy2002: helmfile [staging] START helmfile.d/services/geo-analytics: apply
  • 00:28 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/eventstreams-internal: apply
  • 00:28 rzl@deploy2002: helmfile [staging] START helmfile.d/services/eventstreams-internal: apply
  • 00:28 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
  • 00:27 rzl@deploy2002: helmfile [staging] START helmfile.d/services/eventstreams: apply
  • 00:27 eileen: civicrm upgraded from 000ff848 to a0d1f1f7
  • 00:27 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-main: apply
  • 00:27 rzl@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-main: apply
  • 00:27 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply
  • 00:27 rzl@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply
  • 00:26 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-analytics-external: apply
  • 00:26 rzl@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-analytics-external: apply
  • 00:26 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-analytics: apply
  • 00:26 rzl@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-analytics: apply
  • 00:25 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
  • 00:25 rzl@deploy2002: helmfile [staging] START helmfile.d/services/editor-analytics: apply
  • 00:25 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
  • 00:25 rzl@deploy2002: helmfile [staging] START helmfile.d/services/edit-analytics: apply
  • 00:24 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/echostore: apply
  • 00:24 rzl@deploy2002: helmfile [staging] START helmfile.d/services/echostore: apply
  • 00:24 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
  • 00:24 rzl@deploy2002: helmfile [staging] START helmfile.d/services/device-analytics: apply
  • 00:23 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
  • 00:23 rzl@deploy2002: helmfile [staging] START helmfile.d/services/developer-portal: apply
  • 00:23 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/data-gateway: apply
  • 00:23 rzl@deploy2002: helmfile [staging] START helmfile.d/services/data-gateway: apply
  • 00:22 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
  • 00:22 rzl@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
  • 00:22 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/commons-impact-analytics: apply
  • 00:22 rzl@deploy2002: helmfile [staging] START helmfile.d/services/commons-impact-analytics: apply
  • 00:21 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/citoid: apply
  • 00:20 rzl@deploy2002: helmfile [staging] START helmfile.d/services/citoid: apply
  • 00:20 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:20 rzl@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:20 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/chart-renderer: apply
  • 00:20 rzl@deploy2002: helmfile [staging] START helmfile.d/services/chart-renderer: apply
  • 00:18 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/apertium: apply
  • 00:17 rzl@deploy2002: helmfile [staging] START helmfile.d/services/apertium: apply
  • 00:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P86687 and previous config saved to /var/cache/conftool/dbconfig/20251217-001617-marostegui.json
  • 00:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86686 and previous config saved to /var/cache/conftool/dbconfig/20251217-000109-marostegui.json

2025-12-16

  • 23:40 egardner@deploy2002: Finished scap sync-world: Backport for [Moderator tools] Add data-mw-interface in addition to data-mw="interface" (T409187), Delay StickyHeaders section click instrumentation for slow loads (T412857) (duration: 11m 47s)
  • 23:34 egardner@deploy2002: jsn, egardner: Continuing with sync
  • 23:32 egardner@deploy2002: jsn, egardner: Backport for [Moderator tools] Add data-mw-interface in addition to data-mw="interface" (T409187), Delay StickyHeaders section click instrumentation for slow loads (T412857) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 23:28 egardner@deploy2002: Started scap sync-world: Backport for [Moderator tools] Add data-mw-interface in addition to data-mw="interface" (T409187), Delay StickyHeaders section click instrumentation for slow loads (T412857)
  • 23:04 jsn@deploy2002: Finished scap sync-world: Backport for product_metrics.special_create_account: Collect mediawiki_database (T412866) (duration: 50m 45s)
  • 22:51 jsn@deploy2002: kharlan, jsn: Continuing with sync
  • 22:50 jsn@deploy2002: kharlan, jsn: Backport for product_metrics.special_create_account: Collect mediawiki_database (T412866) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 22:41 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
  • 22:40 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-test: apply
  • 22:13 jsn@deploy2002: Started scap sync-world: Backport for product_metrics.special_create_account: Collect mediawiki_database (T412866)
  • 22:02 jsn@deploy2002: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.5,1.46.0-wmf.7,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/mediawi
  • 21:58 Amir1: mwscript-k8s --follow -- findBadBlobs.php --wiki elwiki --mark "Corrupted UTF-8 (T351953)" --revisions 26381,30551 (T351953)
  • 21:17 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1227 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86684 and previous config saved to /var/cache/conftool/dbconfig/20251216-211743-marostegui.json
  • 21:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1227.eqiad.wmnet with reason: Maintenance
  • 21:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86683 and previous config saved to /var/cache/conftool/dbconfig/20251216-211718-marostegui.json
  • 21:08 jsn@deploy2002: Started scap sync-world: Backport for product_metrics.special_create_account: Collect mediawiki_database (T412866)
  • 21:06 eileen: civicrm upgraded from 03479639 to 000ff848
  • 21:02 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P86682 and previous config saved to /var/cache/conftool/dbconfig/20251216-210210-marostegui.json
  • 20:47 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P86681 and previous config saved to /var/cache/conftool/dbconfig/20251216-204701-marostegui.json
  • 20:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86680 and previous config saved to /var/cache/conftool/dbconfig/20251216-203153-marostegui.json
  • 20:17 dzahn@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 20:16 dzahn@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 19:45 cstone: SmashPig upgraded from 5c731f99 to 631fff60
  • 19:26 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2208 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86678 and previous config saved to /var/cache/conftool/dbconfig/20251216-192603-marostegui.json
  • 19:25 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2208.codfw.wmnet with reason: Maintenance
  • 19:25 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2200.codfw.wmnet with reason: Maintenance
  • 19:23 ryankemper@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on an-worker1148.eqiad.wmnet with reason: T411919
  • 19:18 dancy@deploy2002: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.7 refs T408277
  • 19:18 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1202 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86676 and previous config saved to /var/cache/conftool/dbconfig/20251216-191759-marostegui.json
  • 19:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1202.eqiad.wmnet with reason: Maintenance
  • 19:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86675 and previous config saved to /var/cache/conftool/dbconfig/20251216-191733-marostegui.json
  • 19:02 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P86674 and previous config saved to /var/cache/conftool/dbconfig/20251216-190225-marostegui.json
  • 18:47 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P86673 and previous config saved to /var/cache/conftool/dbconfig/20251216-184717-marostegui.json
  • 18:38 cmooney@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es2028.codfw.wmnet with OS trixie
  • 18:32 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86672 and previous config saved to /var/cache/conftool/dbconfig/20251216-183208-marostegui.json
  • 17:59 tappof: Cleaned up old files (not deleted by logrotate) on centrallog1002; removed the rsyslog-debug file on centrallog1002.
  • 17:18 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1194 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86671 and previous config saved to /var/cache/conftool/dbconfig/20251216-171841-marostegui.json
  • 17:18 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1194.eqiad.wmnet with reason: Maintenance
  • 17:18 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host es2028
  • 17:18 cmooney@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host es2028
  • 17:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86670 and previous config saved to /var/cache/conftool/dbconfig/20251216-171816-marostegui.json
  • 17:18 cmooney@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host es2028
  • 17:18 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) es2028.codfw.wmnet 140.0.192.10.in-addr.arpa 0.4.1.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 17:18 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache es2028.codfw.wmnet 140.0.192.10.in-addr.arpa 0.4.1.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 17:18 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:18 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host es2028 - cmooney@cumin1003"
  • 17:18 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host es2028 - cmooney@cumin1003"
  • 17:14 cmooney@cumin1003: START - Cookbook sre.dns.netbox
  • 17:14 cmooney@cumin1003: START - Cookbook sre.hosts.move-vlan for host es2028
  • 17:14 cmooney@cumin1003: START - Cookbook sre.hosts.reimage for host es2028.codfw.wmnet with OS trixie
  • 17:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P86669 and previous config saved to /var/cache/conftool/dbconfig/20251216-170308-marostegui.json
  • 17:01 derick@deploy2002: mwscript-k8s job started: extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=enwikibooks --logwiki=metawiki Magiuser 'Renamed user f3a49d320a6984a0d6b403d313476916' # T412784
  • 16:54 stran@deploy2002: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
  • 16:54 stran@deploy2002: helmfile [codfw] START helmfile.d/services/ipoid: apply
  • 16:54 stran@deploy2002: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
  • 16:53 stran@deploy2002: helmfile [eqiad] START helmfile.d/services/ipoid: apply
  • 16:52 stran@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
  • 16:52 stran@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
  • 16:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P86668 and previous config saved to /var/cache/conftool/dbconfig/20251216-164800-marostegui.json
  • 16:47 moritzm: installing unbound security updates
  • 16:32 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86667 and previous config saved to /var/cache/conftool/dbconfig/20251216-163252-marostegui.json
  • 16:18 brett@dns1006: END - running authdns-update
  • 16:15 brett@dns1006: START - running authdns-update
  • 16:04 brennen@deploy2002: Finished deploy [phabricator/deployment@3a23687]: deploy phab1004 for T412825 (duration: 00m 58s)
  • 16:03 brennen@deploy2002: Started deploy [phabricator/deployment@3a23687]: deploy phab1004 for T412825
  • 16:03 brennen@deploy2002: Finished deploy [phabricator/deployment@3a23687]: deploy phab2002 for T412825 (duration: 00m 31s)
  • 16:03 brennen@deploy2002: Started deploy [phabricator/deployment@3a23687]: deploy phab2002 for T412825
  • 16:02 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: Phabricator deploy
  • 16:01 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phabricator deploy
  • 15:47 hashar: Restarting CI Jenkins
  • 15:46 jmm@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 15:45 jmm@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 15:30 kharlan@deploy2002: Finished scap sync-world: Backport for hCaptcha: end frwiki A/B test (T405239) (duration: 13m 26s)
  • 15:26 gehel: cleanup temp files on archiva1002
  • 15:26 kharlan@deploy2002: kharlan: Continuing with sync
  • 15:25 ejegg: payments-wiki upgraded from 8db01377 to 8a207d81
  • 15:18 kharlan@deploy2002: kharlan: Backport for hCaptcha: end frwiki A/B test (T405239) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 15:18 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1191 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86665 and previous config saved to /var/cache/conftool/dbconfig/20251216-151834-marostegui.json
  • 15:18 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1191.eqiad.wmnet with reason: Maintenance
  • 15:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86664 and previous config saved to /var/cache/conftool/dbconfig/20251216-151809-marostegui.json
  • 15:16 kharlan@deploy2002: Started scap sync-world: Backport for hCaptcha: end frwiki A/B test (T405239)
  • 15:06 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
  • 15:06 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
  • 15:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P86663 and previous config saved to /var/cache/conftool/dbconfig/20251216-150301-marostegui.json
  • 14:57 Dreamy_Jazz: Afternoon UTC backport window done
  • 14:56 dreamyjazz@deploy2002: Finished scap sync-world: Backport for Pin $wgCheckUserUserAgentTableMigrationStage as SCHEMA_COMPAT_OLD (T361173) (duration: 06m 55s)
  • 14:52 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
  • 14:52 dreamyjazz@deploy2002: dreamyjazz: Backport for Pin $wgCheckUserUserAgentTableMigrationStage as SCHEMA_COMPAT_OLD (T361173) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:49 dreamyjazz@deploy2002: Started scap sync-world: Backport for Pin $wgCheckUserUserAgentTableMigrationStage as SCHEMA_COMPAT_OLD (T361173)
  • 14:47 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P86662 and previous config saved to /var/cache/conftool/dbconfig/20251216-144752-marostegui.json
  • 14:47 sbisson@deploy2002: Finished scap sync-world: Backport for CX3 Build 1.0.0+20251215 (T408842 T411779) (duration: 07m 27s)
  • 14:45 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2198.codfw.wmnet with reason: Maintenance
  • 14:45 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86661 and previous config saved to /var/cache/conftool/dbconfig/20251216-144533-marostegui.json
  • 14:43 sbisson@deploy2002: sbisson: Continuing with sync
  • 14:42 sbisson@deploy2002: sbisson: Backport for CX3 Build 1.0.0+20251215 (T408842 T411779) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:39 sbisson@deploy2002: Started scap sync-world: Backport for CX3 Build 1.0.0+20251215 (T408842 T411779)
  • 14:37 sbisson@deploy2002: Finished scap sync-world: Backport for svwiki: lift autoconfirmed setting (T412713) (duration: 09m 49s)
  • 14:33 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wdqs1029.eqiad.wmnet with OS trixie
  • 14:33 sbisson@deploy2002: sbisson, hamishz: Continuing with sync
  • 14:32 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86660 and previous config saved to /var/cache/conftool/dbconfig/20251216-143244-marostegui.json
  • 14:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P86659 and previous config saved to /var/cache/conftool/dbconfig/20251216-143025-marostegui.json
  • 14:29 sbisson@deploy2002: sbisson, hamishz: Backport for svwiki: lift autoconfirmed setting (T412713) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:29 moritzm: installing glibc security updates
  • 14:27 sbisson@deploy2002: Started scap sync-world: Backport for svwiki: lift autoconfirmed setting (T412713)
  • 14:26 jmm@deploy2002: helmfile [eqiad] DONE helmfile.d/services/proton: apply
  • 14:25 sbisson@deploy2002: Finished scap sync-world: Backport for zhwiki: enable protection indicators (T412710) (duration: 08m 05s)
  • 14:24 jmm@deploy2002: helmfile [eqiad] START helmfile.d/services/proton: apply
  • 14:21 sbisson@deploy2002: sbisson, hamishz: Continuing with sync
  • 14:19 sbisson@deploy2002: sbisson, hamishz: Backport for zhwiki: enable protection indicators (T412710) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:17 sbisson@deploy2002: Started scap sync-world: Backport for zhwiki: enable protection indicators (T412710)
  • 14:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P86658 and previous config saved to /var/cache/conftool/dbconfig/20251216-141517-marostegui.json
  • 14:13 jmm@deploy2002: helmfile [codfw] DONE helmfile.d/services/proton: apply
  • 14:12 sbisson@deploy2002: Finished scap sync-world: Backport for core-Permission: Add abusefilter-access-protected-vars to temporary-account-viewer in jawiki (T412791) (duration: 07m 15s)
  • 14:11 jmm@deploy2002: helmfile [codfw] START helmfile.d/services/proton: apply
  • 14:10 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest2003.codfw.wmnet
  • 14:10 jmm@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['es2028.codfw.wmnet']
  • 14:09 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1029.eqiad.wmnet with reason: host reimage
  • 14:08 sbisson@deploy2002: bunnypranav, sbisson: Continuing with sync
  • 14:08 jmm@deploy2002: helmfile [staging] DONE helmfile.d/services/proton: apply
  • 14:07 ayounsi@cumin1003: START - Cookbook sre.hosts.reboot-single for host sretest2003.codfw.wmnet
  • 14:07 sbisson@deploy2002: bunnypranav, sbisson: Backport for core-Permission: Add abusefilter-access-protected-vars to temporary-account-viewer in jawiki (T412791) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:06 jmm@deploy2002: helmfile [staging] START helmfile.d/services/proton: apply
  • 14:05 sbisson@deploy2002: Started scap sync-world: Backport for core-Permission: Add abusefilter-access-protected-vars to temporary-account-viewer in jawiki (T412791)
  • 14:05 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest2003.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 14:04 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['es2028.codfw.wmnet']
  • 14:04 ayounsi@cumin1003: START - Cookbook sre.hosts.provision for host sretest2003.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 14:04 jmm@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['es2028.codfw.wmnet']
  • 14:02 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1029.eqiad.wmnet with reason: host reimage
  • 14:00 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86657 and previous config saved to /var/cache/conftool/dbconfig/20251216-140008-marostegui.json
  • 13:56 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['es2028.codfw.wmnet']
  • 13:55 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es2028.codfw.wmnet with OS trixie
  • 13:50 jmm@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['es2028.codfw.wmnet']
  • 13:44 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1029.eqiad.wmnet with OS trixie
  • 13:43 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['es2028.codfw.wmnet']
  • 13:43 jmm@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['es2028.codfw.wmnet']
  • 13:36 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['es2028.codfw.wmnet']
  • 13:35 jmm@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['es2028.codfw.wmnet']
  • 13:30 Emperor: enable puppet on O:swift::proxy
  • 13:29 Emperor: repool ms-fe1010
  • 13:24 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['es2028.codfw.wmnet']
  • 13:14 Emperor: depool ms-fe1010 for testing
  • 13:06 Emperor: disable puppet on O:swift::proxy
  • 13:01 godog: fix network configuration and reboot cloudcephosd1052 - T399180
  • 12:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts puppetmaster1003.eqiad.wmnet
  • 12:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: puppetmaster1003.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 12:52 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: puppetmaster1003.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 12:51 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-main: apply
  • 12:51 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-main: apply
  • 12:45 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 12:38 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts puppetmaster1003.eqiad.wmnet
  • 12:38 dreamyjazz@deploy2002: Finished scap sync-world: Backport for Follow-up: SI: Add "past checks" link next to accounts in table pager (T411268) (duration: 10m 47s)
  • 12:35 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2028.codfw.wmnet with OS trixie
  • 12:34 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host es2028.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 12:32 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
  • 12:31 dreamyjazz@deploy2002: dreamyjazz: Backport for Follow-up: SI: Add "past checks" link next to accounts in table pager (T411268) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 12:28 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
  • 12:28 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
  • 12:27 dreamyjazz@deploy2002: Started scap sync-world: Backport for Follow-up: SI: Add "past checks" link next to accounts in table pager (T411268)
  • 12:27 marostegui@cumin1003: START - Cookbook sre.hosts.provision for host es2028.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 12:15 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host es2028.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 12:12 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
  • 12:12 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
  • 12:08 dreamyjazz@deploy2002: Finished scap sync-world: Backport for Remove definition of wgGlobalBlockingEnableAutoblocks (T379086), Show global autoblocks in the globalblocks list API response (T379087) (duration: 67m 55s)
  • 11:54 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
  • 11:50 dreamyjazz@deploy2002: dreamyjazz: Backport for Remove definition of wgGlobalBlockingEnableAutoblocks (T379086), Show global autoblocks in the globalblocks list API response (T379087) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 11:40 marostegui@cumin1003: START - Cookbook sre.hosts.provision for host es2028.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 11:39 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wdqs1029.eqiad.wmnet with OS trixie
  • 11:22 marostegui@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host es2028.codfw.wmnet with OS trixie
  • 11:19 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1029.eqiad.wmnet with reason: host reimage
  • 11:15 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1029.eqiad.wmnet with reason: host reimage
  • 11:04 urbanecm@deploy2002: mwscript-k8s job started: GrowthExperiments:fixLinkRecommendationData --wiki=itwiki --dry-run --search-index --db-table # T412040-fix-dryrun-02
  • 11:00 dreamyjazz@deploy2002: Started scap sync-world: Backport for Remove definition of wgGlobalBlockingEnableAutoblocks (T379086), Show global autoblocks in the globalblocks list API response (T379087)
  • 10:58 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1029.eqiad.wmnet with OS trixie
  • 10:58 mwpresync@deploy2002: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.5,1.46.0-wmf.7,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/m
  • 10:57 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wdqs1029.eqiad.wmnet with OS trixie
  • 10:46 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2028.codfw.wmnet with OS trixie
  • 10:45 marostegui@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host es2028.codfw.wmnet with OS trixie
  • 10:44 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1029.eqiad.wmnet with OS trixie
  • 10:34 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2028.codfw.wmnet with OS trixie
  • 10:32 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2028.codfw.wmnet with reason: reimage
  • 10:05 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wdqs1029.eqiad.wmnet with OS trixie
  • 10:05 mwpresync@deploy2002: Started scap sync-world: testwikis to 1.46.0-wmf.7 refs T408277
  • 10:04 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab1003.wikimedia.org
  • 10:03 hashar: Started MediaWiki train task `train-presync`. It did not run overnight due to a CI failure | T408277
  • 09:58 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab1003.wikimedia.org
  • 09:54 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab2002.wikimedia.org
  • 09:46 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab2002.wikimedia.org
  • 09:43 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1029.eqiad.wmnet with reason: host reimage
  • 09:40 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1029.eqiad.wmnet with reason: host reimage
  • 09:37 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2209 (T410589)', diff saved to https://phabricator.wikimedia.org/P86654 and previous config saved to /var/cache/conftool/dbconfig/20251216-093745-ladsgroup.json
  • 09:37 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2209.codfw.wmnet with reason: Maintenance
  • 09:37 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 (T410589)', diff saved to https://phabricator.wikimedia.org/P86653 and previous config saved to /var/cache/conftool/dbconfig/20251216-093720-ladsgroup.json
  • 09:22 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P86652 and previous config saved to /var/cache/conftool/dbconfig/20251216-092212-ladsgroup.json
  • 09:22 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1029.eqiad.wmnet with OS trixie
  • 09:21 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "add primary IP to ps1-e10-eqiad - ayounsi@cumin1003"
  • 09:20 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "add primary IP to ps1-e10-eqiad - ayounsi@cumin1003"
  • 09:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts puppetmaster2002.codfw.wmnet
  • 09:12 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:12 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: puppetmaster2002.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 09:07 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P86651 and previous config saved to /var/cache/conftool/dbconfig/20251216-090704-ladsgroup.json
  • 09:04 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: puppetmaster2002.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 09:01 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 08:55 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts puppetmaster2002.codfw.wmnet
  • 08:51 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 (T410589)', diff saved to https://phabricator.wikimedia.org/P86650 and previous config saved to /var/cache/conftool/dbconfig/20251216-085155-ladsgroup.json
  • 08:48 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2182 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86649 and previous config saved to /var/cache/conftool/dbconfig/20251216-084817-marostegui.json
  • 08:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2182.codfw.wmnet with reason: Maintenance
  • 08:47 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86648 and previous config saved to /var/cache/conftool/dbconfig/20251216-084752-marostegui.json
  • 08:32 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P86647 and previous config saved to /var/cache/conftool/dbconfig/20251216-083243-marostegui.json
  • 08:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P86646 and previous config saved to /var/cache/conftool/dbconfig/20251216-081735-marostegui.json
  • 08:11 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wdqs1029.eqiad.wmnet with OS bookworm
  • 08:02 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86645 and previous config saved to /var/cache/conftool/dbconfig/20251216-080227-marostegui.json
  • 07:37 ayounsi@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
  • 07:36 ayounsi@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
  • 07:21 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1181 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86644 and previous config saved to /var/cache/conftool/dbconfig/20251216-072114-marostegui.json
  • 07:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 07:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86643 and previous config saved to /var/cache/conftool/dbconfig/20251216-072049-marostegui.json
  • 07:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P86642 and previous config saved to /var/cache/conftool/dbconfig/20251216-070542-marostegui.json
  • 06:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P86641 and previous config saved to /var/cache/conftool/dbconfig/20251216-065033-marostegui.json
  • 06:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86640 and previous config saved to /var/cache/conftool/dbconfig/20251216-063525-marostegui.json
  • 05:16 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1174 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86639 and previous config saved to /var/cache/conftool/dbconfig/20251216-051607-marostegui.json
  • 05:15 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 02:52 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2168 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86638 and previous config saved to /var/cache/conftool/dbconfig/20251216-025200-marostegui.json
  • 02:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2168.codfw.wmnet with reason: Maintenance
  • 02:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86637 and previous config saved to /var/cache/conftool/dbconfig/20251216-025136-marostegui.json
  • 02:50 ladsgroup@deploy2002: Finished scap sync-world: Backport for SpecialLinkSearch: Add a message when domains are being ignored (T405005) (duration: 38m 47s)
  • 02:37 ladsgroup@deploy2002: ladsgroup: Continuing with sync
  • 02:36 ladsgroup@deploy2002: ladsgroup: Backport for SpecialLinkSearch: Add a message when domains are being ignored (T405005) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 02:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P86636 and previous config saved to /var/cache/conftool/dbconfig/20251216-023627-marostegui.json
  • 02:21 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P86635 and previous config saved to /var/cache/conftool/dbconfig/20251216-022119-marostegui.json
  • 02:11 ladsgroup@deploy2002: Started scap sync-world: Backport for SpecialLinkSearch: Add a message when domains are being ignored (T405005)
  • 02:06 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86634 and previous config saved to /var/cache/conftool/dbconfig/20251216-020611-marostegui.json
  • 01:01 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 01m 15s)
  • 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
  • 00:53 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
  • 00:52 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
  • 00:50 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
  • 00:50 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
  • 00:49 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
  • 00:48 rzl@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
  • 00:46 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
  • 00:45 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/api-gateway: apply
  • 00:44 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
  • 00:43 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
  • 00:42 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
  • 00:41 rzl@deploy2002: helmfile [staging] START helmfile.d/services/api-gateway: apply

2025-12-15

2025-12-14

  • 21:22 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2190 (T410589)', diff saved to https://phabricator.wikimedia.org/P86602 and previous config saved to /var/cache/conftool/dbconfig/20251214-212240-ladsgroup.json
  • 21:22 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2190.codfw.wmnet with reason: Maintenance
  • 21:22 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T410589)', diff saved to https://phabricator.wikimedia.org/P86601 and previous config saved to /var/cache/conftool/dbconfig/20251214-212226-ladsgroup.json
  • 21:07 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P86600 and previous config saved to /var/cache/conftool/dbconfig/20251214-210717-ladsgroup.json
  • 20:52 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P86599 and previous config saved to /var/cache/conftool/dbconfig/20251214-205208-ladsgroup.json
  • 20:37 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T410589)', diff saved to https://phabricator.wikimedia.org/P86598 and previous config saved to /var/cache/conftool/dbconfig/20251214-203700-ladsgroup.json
  • 20:12 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1259 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86597 and previous config saved to /var/cache/conftool/dbconfig/20251214-201213-marostegui.json
  • 20:12 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1259.eqiad.wmnet with reason: Maintenance
  • 20:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86596 and previous config saved to /var/cache/conftool/dbconfig/20251214-201148-marostegui.json
  • 19:56 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P86595 and previous config saved to /var/cache/conftool/dbconfig/20251214-195640-marostegui.json
  • 19:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P86594 and previous config saved to /var/cache/conftool/dbconfig/20251214-194132-marostegui.json
  • 19:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86593 and previous config saved to /var/cache/conftool/dbconfig/20251214-192623-marostegui.json
  • 14:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86592 and previous config saved to /var/cache/conftool/dbconfig/20251214-145800-marostegui.json
  • 14:42 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238', diff saved to https://phabricator.wikimedia.org/P86591 and previous config saved to /var/cache/conftool/dbconfig/20251214-144251-marostegui.json
  • 14:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238', diff saved to https://phabricator.wikimedia.org/P86590 and previous config saved to /var/cache/conftool/dbconfig/20251214-142743-marostegui.json
  • 14:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86589 and previous config saved to /var/cache/conftool/dbconfig/20251214-141235-marostegui.json
  • 13:28 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1254 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86588 and previous config saved to /var/cache/conftool/dbconfig/20251214-132817-marostegui.json
  • 13:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1254.eqiad.wmnet with reason: Maintenance
  • 08:31 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2238 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86587 and previous config saved to /var/cache/conftool/dbconfig/20251214-083116-marostegui.json
  • 08:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2238.codfw.wmnet with reason: Maintenance
  • 08:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86586 and previous config saved to /var/cache/conftool/dbconfig/20251214-083051-marostegui.json
  • 08:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226', diff saved to https://phabricator.wikimedia.org/P86585 and previous config saved to /var/cache/conftool/dbconfig/20251214-081543-marostegui.json
  • 08:00 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226', diff saved to https://phabricator.wikimedia.org/P86584 and previous config saved to /var/cache/conftool/dbconfig/20251214-080034-marostegui.json
  • 07:45 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86583 and previous config saved to /var/cache/conftool/dbconfig/20251214-074526-marostegui.json
  • 07:40 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
  • 07:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86582 and previous config saved to /var/cache/conftool/dbconfig/20251214-073957-marostegui.json
  • 07:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P86581 and previous config saved to /var/cache/conftool/dbconfig/20251214-072449-marostegui.json
  • 07:09 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P86580 and previous config saved to /var/cache/conftool/dbconfig/20251214-070940-marostegui.json
  • 06:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86579 and previous config saved to /var/cache/conftool/dbconfig/20251214-065432-marostegui.json
  • 06:27 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2226 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86578 and previous config saved to /var/cache/conftool/dbconfig/20251214-062752-marostegui.json
  • 06:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2226.codfw.wmnet with reason: Maintenance
  • 06:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86577 and previous config saved to /var/cache/conftool/dbconfig/20251214-062727-marostegui.json
  • 06:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225', diff saved to https://phabricator.wikimedia.org/P86576 and previous config saved to /var/cache/conftool/dbconfig/20251214-061219-marostegui.json
  • 05:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225', diff saved to https://phabricator.wikimedia.org/P86575 and previous config saved to /var/cache/conftool/dbconfig/20251214-055711-marostegui.json
  • 05:42 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86574 and previous config saved to /var/cache/conftool/dbconfig/20251214-054202-marostegui.json
  • 03:20 eileen: civicrm upgraded from 8a0822ef to 03479639
  • 01:59 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2177 (T410589)', diff saved to https://phabricator.wikimedia.org/P86573 and previous config saved to /var/cache/conftool/dbconfig/20251214-015920-ladsgroup.json
  • 01:59 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2177.codfw.wmnet with reason: Maintenance
  • 01:58 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T410589)', diff saved to https://phabricator.wikimedia.org/P86572 and previous config saved to /var/cache/conftool/dbconfig/20251214-015856-ladsgroup.json
  • 01:43 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P86571 and previous config saved to /var/cache/conftool/dbconfig/20251214-014348-ladsgroup.json
  • 01:30 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 29m 42s)
  • 01:28 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P86570 and previous config saved to /var/cache/conftool/dbconfig/20251214-012839-ladsgroup.json
  • 01:13 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T410589)', diff saved to https://phabricator.wikimedia.org/P86569 and previous config saved to /var/cache/conftool/dbconfig/20251214-011331-ladsgroup.json
  • 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
  • 00:56 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1233 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86568 and previous config saved to /var/cache/conftool/dbconfig/20251214-005607-marostegui.json
  • 00:56 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1233.eqiad.wmnet with reason: Maintenance
  • 00:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86567 and previous config saved to /var/cache/conftool/dbconfig/20251214-005542-marostegui.json
  • 00:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P86566 and previous config saved to /var/cache/conftool/dbconfig/20251214-004034-marostegui.json
  • 00:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P86565 and previous config saved to /var/cache/conftool/dbconfig/20251214-002526-marostegui.json
  • 00:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86564 and previous config saved to /var/cache/conftool/dbconfig/20251214-001017-marostegui.json

2025-12-13

  • 23:54 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2225 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86563 and previous config saved to /var/cache/conftool/dbconfig/20251213-235427-marostegui.json
  • 23:54 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2225.codfw.wmnet with reason: Maintenance
  • 23:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86562 and previous config saved to /var/cache/conftool/dbconfig/20251213-235413-marostegui.json
  • 23:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P86561 and previous config saved to /var/cache/conftool/dbconfig/20251213-233905-marostegui.json
  • 23:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P86560 and previous config saved to /var/cache/conftool/dbconfig/20251213-232356-marostegui.json
  • 23:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86559 and previous config saved to /var/cache/conftool/dbconfig/20251213-230848-marostegui.json
  • 17:54 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1229 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86558 and previous config saved to /var/cache/conftool/dbconfig/20251213-175442-marostegui.json
  • 17:54 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1229.eqiad.wmnet with reason: Maintenance
  • 17:11 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2207 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86557 and previous config saved to /var/cache/conftool/dbconfig/20251213-171057-marostegui.json
  • 17:10 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2207.codfw.wmnet with reason: Maintenance
  • 12:04 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
  • 12:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86556 and previous config saved to /var/cache/conftool/dbconfig/20251213-120425-marostegui.json
  • 11:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P86555 and previous config saved to /var/cache/conftool/dbconfig/20251213-114916-marostegui.json
  • 11:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P86554 and previous config saved to /var/cache/conftool/dbconfig/20251213-113408-marostegui.json
  • 11:22 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2197.codfw.wmnet with reason: Maintenance
  • 11:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86553 and previous config saved to /var/cache/conftool/dbconfig/20251213-112229-marostegui.json
  • 11:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86552 and previous config saved to /var/cache/conftool/dbconfig/20251213-111900-marostegui.json
  • 11:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P86551 and previous config saved to /var/cache/conftool/dbconfig/20251213-110720-marostegui.json
  • 10:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P86550 and previous config saved to /var/cache/conftool/dbconfig/20251213-105212-marostegui.json
  • 10:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86549 and previous config saved to /var/cache/conftool/dbconfig/20251213-103704-marostegui.json
  • 09:49 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1197 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86548 and previous config saved to /var/cache/conftool/dbconfig/20251213-094944-marostegui.json
  • 09:49 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
  • 09:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86547 and previous config saved to /var/cache/conftool/dbconfig/20251213-094920-marostegui.json
  • 09:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P86546 and previous config saved to /var/cache/conftool/dbconfig/20251213-093412-marostegui.json
  • 09:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P86545 and previous config saved to /var/cache/conftool/dbconfig/20251213-091903-marostegui.json
  • 09:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86544 and previous config saved to /var/cache/conftool/dbconfig/20251213-090355-marostegui.json
  • 07:34 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1188 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86543 and previous config saved to /var/cache/conftool/dbconfig/20251213-073445-marostegui.json
  • 07:34 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
  • 07:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86542 and previous config saved to /var/cache/conftool/dbconfig/20251213-073421-marostegui.json
  • 07:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P86541 and previous config saved to /var/cache/conftool/dbconfig/20251213-071913-marostegui.json
  • 07:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P86540 and previous config saved to /var/cache/conftool/dbconfig/20251213-070405-marostegui.json
  • 06:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86539 and previous config saved to /var/cache/conftool/dbconfig/20251213-064856-marostegui.json
  • 06:30 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2156 (T410589)', diff saved to https://phabricator.wikimedia.org/P86538 and previous config saved to /var/cache/conftool/dbconfig/20251213-063023-ladsgroup.json
  • 06:30 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2156.codfw.wmnet with reason: Maintenance
  • 06:30 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T410589)', diff saved to https://phabricator.wikimedia.org/P86537 and previous config saved to /var/cache/conftool/dbconfig/20251213-062958-ladsgroup.json
  • 06:14 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P86536 and previous config saved to /var/cache/conftool/dbconfig/20251213-061450-ladsgroup.json
  • 05:59 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P86535 and previous config saved to /var/cache/conftool/dbconfig/20251213-055942-ladsgroup.json
  • 05:44 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T410589)', diff saved to https://phabricator.wikimedia.org/P86534 and previous config saved to /var/cache/conftool/dbconfig/20251213-054433-ladsgroup.json
  • 05:02 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2189 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86533 and previous config saved to /var/cache/conftool/dbconfig/20251213-050223-marostegui.json
  • 05:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2189.codfw.wmnet with reason: Maintenance
  • 05:02 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86532 and previous config saved to /var/cache/conftool/dbconfig/20251213-050158-marostegui.json
  • 04:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P86531 and previous config saved to /var/cache/conftool/dbconfig/20251213-044649-marostegui.json
  • 04:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P86530 and previous config saved to /var/cache/conftool/dbconfig/20251213-043141-marostegui.json
  • 04:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86529 and previous config saved to /var/cache/conftool/dbconfig/20251213-041633-marostegui.json
  • 01:18 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 17m 42s)
  • 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image

2025-12-12

  • 23:34 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1182 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86528 and previous config saved to /var/cache/conftool/dbconfig/20251212-233453-marostegui.json
  • 23:34 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
  • 23:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86527 and previous config saved to /var/cache/conftool/dbconfig/20251212-233428-marostegui.json
  • 23:22 tzatziki: removing 1 file for legal compliance
  • 23:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P86526 and previous config saved to /var/cache/conftool/dbconfig/20251212-231920-marostegui.json
  • 23:16 tzatziki: removing 4 files for legal compliance
  • 23:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P86525 and previous config saved to /var/cache/conftool/dbconfig/20251212-230412-marostegui.json
  • 22:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86524 and previous config saved to /var/cache/conftool/dbconfig/20251212-224903-marostegui.json
  • 21:32 tzatziki: removing 4 files for legal compliance
  • 21:23 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2175 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86523 and previous config saved to /var/cache/conftool/dbconfig/20251212-212305-marostegui.json
  • 21:22 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2175.codfw.wmnet with reason: Maintenance
  • 21:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86522 and previous config saved to /var/cache/conftool/dbconfig/20251212-212240-marostegui.json
  • 21:13 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1162 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86521 and previous config saved to /var/cache/conftool/dbconfig/20251212-211309-marostegui.json
  • 21:13 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
  • 21:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86520 and previous config saved to /var/cache/conftool/dbconfig/20251212-211245-marostegui.json
  • 21:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P86519 and previous config saved to /var/cache/conftool/dbconfig/20251212-210731-marostegui.json
  • 20:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P86518 and previous config saved to /var/cache/conftool/dbconfig/20251212-205737-marostegui.json
  • 20:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P86517 and previous config saved to /var/cache/conftool/dbconfig/20251212-205223-marostegui.json
  • 20:42 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P86516 and previous config saved to /var/cache/conftool/dbconfig/20251212-204228-marostegui.json
  • 20:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86515 and previous config saved to /var/cache/conftool/dbconfig/20251212-203715-marostegui.json
  • 20:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86514 and previous config saved to /var/cache/conftool/dbconfig/20251212-202720-marostegui.json
  • 19:35 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker2005.codfw.wmnet with OS bookworm
  • 19:35 jhancock@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin1003"
  • 17:48 jhancock@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin1003"
  • 17:30 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker2005.codfw.wmnet with reason: host reimage
  • 17:26 jhancock@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker2005.codfw.wmnet with reason: host reimage
  • 17:16 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker2005.codfw.wmnet with OS bookworm
  • 16:54 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
  • 16:54 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
  • 15:55 xcollazo@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 15:55 xcollazo@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 15:15 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
  • 15:15 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
  • 14:51 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti-jumbo1002.eqiad.wmnet with OS trixie
  • 14:51 jclark@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
  • 14:51 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti-jumbo1003.eqiad.wmnet with OS trixie
  • 14:51 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
  • 14:34 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
  • 14:19 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
  • 14:18 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti-jumbo1003.eqiad.wmnet with reason: host reimage
  • 14:13 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti-jumbo1003.eqiad.wmnet with reason: host reimage
  • 14:04 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti-jumbo1002.eqiad.wmnet with reason: host reimage
  • 14:03 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host ganeti-jumbo1003.eqiad.wmnet with OS trixie
  • 14:00 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti-jumbo1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 14:00 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti-jumbo1002.eqiad.wmnet with reason: host reimage
  • 13:50 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ganeti-jumbo1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 13:50 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host ganeti-jumbo1002.eqiad.wmnet with OS trixie
  • 13:41 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
  • 13:41 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2148 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86512 and previous config saved to /var/cache/conftool/dbconfig/20251212-134125-marostegui.json
  • 13:41 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 13:41 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2148.codfw.wmnet with reason: Maintenance
  • 13:41 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
  • 13:40 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
  • 13:38 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
  • 13:37 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
  • 13:37 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
  • 13:36 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
  • 13:22 gehel@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wdqs1029.eqiad.wmnet with OS trixie
  • 13:22 gehel@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1029.eqiad.wmnet with OS trixie
  • 13:14 jgleeson: payments-wiki upgraded from 99671dda to fc18b3c0
  • 10:47 gehel@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wdqs1032.eqiad.wmnet with OS trixie
  • 10:39 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2149 (T410589)', diff saved to https://phabricator.wikimedia.org/P86511 and previous config saved to /var/cache/conftool/dbconfig/20251212-103907-ladsgroup.json
  • 10:38 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2149.codfw.wmnet with reason: Maintenance
  • 10:16 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid: apply
  • 10:16 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid: apply
  • 10:11 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid: apply
  • 10:10 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid: apply
  • 09:51 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
  • 09:51 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
  • 09:19 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
  • 09:19 kartik@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
  • 09:14 gehel@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1032.eqiad.wmnet with reason: host reimage
  • 09:09 gehel@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1032.eqiad.wmnet with reason: host reimage
  • 09:08 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
  • 09:08 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
  • 08:55 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 8560
  • 08:54 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 8560
  • 08:51 gehel@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1032.eqiad.wmnet with OS trixie
  • 08:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 14537
  • 08:36 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 14537
  • 08:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 12709
  • 08:30 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 12709
  • 02:20 ejegg: donorwiki upgraded from bbd96c00 to 99671dda
  • 02:08 ejegg: payments-wiki upgraded from 460c2f5d to 99671dda
  • 01:30 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 29m 34s)
  • 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
  • 00:36 larssandergreen: Updating civicrm from 1a5626c4 to 8a0822ef

2025-12-11

  • 23:27 kemayo@deploy2002: Finished scap sync-world: Backport for Add product_metrics.contributors.experiments to wgMetricsPlatformExperimentStreamNames (T405177 T410803) (duration: 07m 38s)
  • 23:25 jhancock@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker2005.codfw.wmnet with OS bookworm
  • 23:23 kemayo@deploy2002: kemayo: Continuing with sync
  • 23:22 kemayo@deploy2002: kemayo: Backport for Add product_metrics.contributors.experiments to wgMetricsPlatformExperimentStreamNames (T405177 T410803) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 23:20 kemayo@deploy2002: Started scap sync-world: Backport for Add product_metrics.contributors.experiments to wgMetricsPlatformExperimentStreamNames (T405177 T410803)
  • 22:40 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker2004.codfw.wmnet with OS bookworm
  • 22:40 jhancock@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin1003"
  • 22:39 jhancock@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin1003"
  • 22:30 maryum: Deployed security fix for T411305
  • 22:19 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker2004.codfw.wmnet with reason: host reimage
  • 22:16 jhancock@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker2004.codfw.wmnet with reason: host reimage
  • 22:05 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker2005.codfw.wmnet with OS bookworm
  • 22:05 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker2004.codfw.wmnet with OS bookworm
  • 21:54 dani@deploy2002: Finished scap sync-world: Backport for Undeploy 2025 Global Readers Survey (T410918), Test Kitchen: StickyHeaders experiment hotfix (T412146) (duration: 09m 07s)
  • 21:52 gehel@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wdqs1031.eqiad.wmnet with OS trixie
  • 21:49 dani@deploy2002: dani, cjming: Continuing with sync
  • 21:47 dani@deploy2002: dani, cjming: Backport for Undeploy 2025 Global Readers Survey (T410918), Test Kitchen: StickyHeaders experiment hotfix (T412146) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 21:45 dani@deploy2002: Started scap sync-world: Backport for Undeploy 2025 Global Readers Survey (T410918), Test Kitchen: StickyHeaders experiment hotfix (T412146)
  • 21:35 urbanecm@deploy2002: Finished scap sync-world: Backport for Revert^3 "Confirmation email: further styling adjustments" (T411526), Revert^3 "i18n: replace <> to avoid false positive export errors" (T411526) (duration: 46m 04s)
  • 21:22 urbanecm@deploy2002: urbanecm: Continuing with sync
  • 21:22 urbanecm@deploy2002: urbanecm: Backport for Revert^3 "Confirmation email: further styling adjustments" (T411526), Revert^3 "i18n: replace <> to avoid false positive export errors" (T411526) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 21:17 xcollazo@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 21:17 xcollazo@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 21:14 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ganeti-jumbo1002.eqiad.wmnet with OS trixie
  • 21:10 xcollazo@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 21:09 xcollazo@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 20:57 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti-jumbo1002.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 20:56 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ganeti-jumbo1002.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 20:50 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti-jumbo1001.eqiad.wmnet with OS trixie
  • 20:50 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
  • 20:49 urbanecm@deploy2002: Started scap sync-world: Backport for Revert^3 "Confirmation email: further styling adjustments" (T411526), Revert^3 "i18n: replace <> to avoid false positive export errors" (T411526)
  • 20:40 urbanecm@deploy2002: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.4,1.46.0-wmf.5,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/me
  • 20:38 brett@dns1006: END - running authdns-update
  • 20:37 brett@dns1006: START - running authdns-update
  • 20:34 gehel@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1031.eqiad.wmnet with reason: host reimage
  • 20:28 gehel@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1031.eqiad.wmnet with reason: host reimage
  • 20:25 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
  • 20:10 gehel@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1031.eqiad.wmnet with OS trixie
  • 20:09 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti-jumbo1001.eqiad.wmnet with reason: host reimage
  • 20:02 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti-jumbo1001.eqiad.wmnet with reason: host reimage
  • 19:54 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host ganeti-jumbo1002.eqiad.wmnet with OS trixie
  • 19:51 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host ganeti-jumbo1001.eqiad.wmnet with OS trixie
  • 19:49 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti-jumbo1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 19:48 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ganeti-jumbo1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 19:47 urbanecm@deploy2002: Started scap sync-world: Backport for Revert^2 "Confirmation email: further styling adjustments" (T411526), Revert^2 "i18n: replace <> to avoid false positive export errors" (T411526)
  • 19:46 urbanecm@deploy2002: Finished scap sync-world: Backport for [Growth] Enable Add Link backend on a handful of wikis (T410469) (duration: 08m 55s)
  • 19:42 urbanecm@deploy2002: urbanecm: Continuing with sync
  • 19:41 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker2005.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 19:40 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 19:40 urbanecm@deploy2002: urbanecm: Backport for [Growth] Enable Add Link backend on a handful of wikis (T410469) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 19:37 urbanecm@deploy2002: Started scap sync-world: Backport for [Growth] Enable Add Link backend on a handful of wikis (T410469)
  • 19:29 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker2005.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 19:28 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 19:27 jhancock@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker2005
  • 19:27 jhancock@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker2004
  • 19:27 jhancock@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker2005
  • 19:27 jhancock@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker2004
  • 19:27 jhancock@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:27 jhancock@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-worker2004-5 to codfw - jhancock@cumin1003"
  • 19:26 jhancock@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-worker2004-5 to codfw - jhancock@cumin1003"
  • 19:23 jhancock@cumin1003: START - Cookbook sre.dns.netbox
  • 19:15 Krinkle: krinkle@deploy1002 sql --write wikifunctionswiki `UPDATE page SET page_touched='20251211191600' WHERE page_id=66102 LIMIT 1;`
  • 18:51 rzl@deploy2002: Finished scap sync-world: https://gerrit.wikimedia.org/r/1217347 T410975 (duration: 04m 54s)
  • 18:50 rzl@deploy2002: rzl: Continuing with sync
  • 18:48 rzl@deploy2002: rzl: https://gerrit.wikimedia.org/r/1217347 T410975 synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 18:47 rzl@deploy2002: Started scap sync-world: https://gerrit.wikimedia.org/r/1217347 T410975
  • 17:37 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
  • 17:37 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
  • 17:20 gehel@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wdqs1030.eqiad.wmnet with OS trixie
  • 17:17 gehel@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wdqs1029.eqiad.wmnet with OS trixie
  • 16:23 jhathaway: upload new package of corto via reprepro
  • 15:47 gehel@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1030.eqiad.wmnet with reason: host reimage
  • 15:43 gehel@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1029.eqiad.wmnet with reason: host reimage
  • 15:43 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
  • 15:43 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
  • 15:41 gehel@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1030.eqiad.wmnet with reason: host reimage
  • 15:39 gehel@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1029.eqiad.wmnet with reason: host reimage
  • 15:24 gehel@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1030.eqiad.wmnet with OS trixie
  • 15:21 gehel@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1029.eqiad.wmnet with OS trixie
  • 15:13 gehel@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1028.eqiad.wmnet with OS trixie
  • 15:02 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti-jumbo1002.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 15:01 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti-jumbo1001.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 14:54 gehel@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1028.eqiad.wmnet with reason: host reimage
  • 14:50 gehel@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1028.eqiad.wmnet with reason: host reimage
  • 14:48 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ganeti-jumbo1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 14:48 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ganeti-jumbo1002.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 14:48 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ganeti-jumbo1001.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 14:43 jclark@cumin1003: START - Cookbook sre.dns.netbox
  • 14:37 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
  • 14:37 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
  • 14:34 kemayo@deploy2002: Finished scap sync-world: Backport for Localisation updates from https://translatewiki.net. (duration: 10m 44s)
  • 14:33 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab
  • 14:33 gehel@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1028.eqiad.wmnet with OS trixie
  • 14:30 kemayo@deploy2002: kemayo: Continuing with sync
  • 14:28 kemayo@deploy2002: kemayo: Backport for Localisation updates from https://translatewiki.net. synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:23 kemayo@deploy2002: Started scap sync-world: Backport for Localisation updates from https://translatewiki.net.
  • 14:21 aude@deploy2002: Finished scap sync-world: Backport for [Legal Footer] Deploy Legal Footer for Phase 1 wikis (T410164) (duration: 11m 32s)
  • 14:16 aude@deploy2002: lmora, aude: Continuing with sync
  • 14:11 aude@deploy2002: lmora, aude: Backport for [Legal Footer] Deploy Legal Footer for Phase 1 wikis (T410164) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:09 aude@deploy2002: Started scap sync-world: Backport for [Legal Footer] Deploy Legal Footer for Phase 1 wikis (T410164)
  • 14:08 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host logging-sd1007.eqiad.wmnet with OS bookworm
  • 14:08 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
  • 14:07 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
  • 14:07 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
  • 14:07 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
  • 14:05 mforns@deploy2002: helmfile [codfw] DONE helmfile.d/services/page-analytics: apply
  • 14:05 mforns@deploy2002: helmfile [codfw] START helmfile.d/services/page-analytics: apply
  • 14:05 mforns@deploy2002: helmfile [eqiad] DONE helmfile.d/services/page-analytics: apply
  • 14:05 mforns@deploy2002: helmfile [eqiad] START helmfile.d/services/page-analytics: apply
  • 14:04 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host logging-sd1006.eqiad.wmnet with OS bookworm
  • 14:04 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
  • 14:04 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
  • 14:03 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host logging-sd1005.eqiad.wmnet with OS bookworm
  • 14:03 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
  • 14:00 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
  • 13:58 mforns@deploy2002: helmfile [staging] DONE helmfile.d/services/page-analytics: apply
  • 13:58 mforns@deploy2002: helmfile [staging] START helmfile.d/services/page-analytics: apply
  • 13:50 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on logging-sd1007.eqiad.wmnet with reason: host reimage
  • 13:50 XioNoX: restart gnmic on netflow1002
  • 13:47 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on logging-sd1006.eqiad.wmnet with reason: host reimage
  • 13:43 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on logging-sd1005.eqiad.wmnet with reason: host reimage
  • 13:43 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on logging-sd1006.eqiad.wmnet with reason: host reimage
  • 13:43 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on logging-sd1007.eqiad.wmnet with reason: host reimage
  • 13:40 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on logging-sd1005.eqiad.wmnet with reason: host reimage
  • 13:20 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab
  • 13:17 krinkle@deploy2002: Finished deploy [performance/navtiming@dde77b9]: Add temporary group for parsoid readviews (duration: 00m 16s)
  • 13:17 krinkle@deploy2002: Started deploy [performance/navtiming@dde77b9]: Add temporary group for parsoid readviews
  • 13:00 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host logging-sd1005.eqiad.wmnet with OS bookworm
  • 13:00 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aqs1025.eqiad.wmnet with OS bullseye
  • 13:00 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
  • 12:59 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
  • 12:59 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host logging-sd1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 12:54 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host logging-sd1006.eqiad.wmnet with OS bookworm
  • 12:54 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host logging-sd1007.eqiad.wmnet with OS bookworm
  • 12:53 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
  • 12:53 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
  • 12:52 jelto@cumin1003: END (FAIL) - Cookbook sre.gitlab.upgrade (exit_code=99) on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab
  • 12:52 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab
  • 12:48 jclark@cumin1003: START - Cookbook sre.hosts.provision for host logging-sd1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 12:48 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host logging-sd1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 12:46 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host logging-sd1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 12:44 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aqs1025.eqiad.wmnet with reason: host reimage
  • 12:40 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aqs1025.eqiad.wmnet with reason: host reimage
  • 12:36 jclark@cumin1003: START - Cookbook sre.hosts.provision for host logging-sd1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 12:34 jclark@cumin1003: START - Cookbook sre.hosts.provision for host logging-sd1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 12:33 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host logging-sd1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 12:32 jclark@cumin1003: START - Cookbook sre.hosts.provision for host logging-sd1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 12:30 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host aqs1025.eqiad.wmnet with OS bullseye
  • 12:26 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aqs1025.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 12:18 jclark@cumin1003: START - Cookbook sre.hosts.provision for host aqs1025.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 12:17 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aqs1025.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 12:16 jclark@cumin1003: START - Cookbook sre.hosts.provision for host aqs1025.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 12:06 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab replica
  • 11:57 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab replica
  • 11:55 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-platform-eng: apply
  • 11:55 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-platform-eng: apply
  • 11:43 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab replica
  • 11:34 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab replica
  • 11:05 jelto@cumin1003: END (FAIL) - Cookbook sre.gitlab.upgrade (exit_code=99) on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab replica
  • 11:01 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab replica
  • 10:57 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-platform-eng: apply
  • 10:57 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-platform-eng: apply
  • 10:37 jelto@cumin1003: END (FAIL) - Cookbook sre.gitlab.upgrade (exit_code=99) on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab replica
  • 10:37 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab replica
  • 10:19 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
  • 10:19 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
  • 10:00 topranks: revert esams transport load balancing
  • 09:41 XioNoX: revert eqsin transport load balancing
  • 09:21 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-platform-eng: apply
  • 09:20 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-platform-eng: apply
  • 03:21 eileen: civicrm upgraded from 41a460d5 to 1a5626c4
  • 03:11 ejegg: fundraising python tools upgraded from 8e900e85 to c75f7625
  • 01:23 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 22m 36s)
  • 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
  • 00:30 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host logging-sd1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 00:29 jclark@cumin1003: START - Cookbook sre.hosts.provision for host logging-sd1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 00:16 rzl: rzl@apt1002:~$ sudo -i reprepro copy trixie-wikimedia bullseye-wikimedia envoyproxy # T410975
  • 00:16 rzl: rzl@apt1002:~$ sudo -i reprepro copy bookworm-wikimedia bullseye-wikimedia envoyproxy # T410975
  • 00:16 rzl: rzl@apt1002:~$ sudo -i reprepro -C main includedeb bullseye-wikimedia /srv/wikimedia/pool/component/envoy-future/e/envoyproxy/envoyproxy_1.35.7-1_amd64.deb # T410975

2025-12-10

  • 23:51 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
  • 23:50 rzl@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
  • 23:47 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host logging-sd1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 23:47 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host logging-sd1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 23:47 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
  • 23:47 rzl@deploy2002: helmfile [staging] START helmfile.d/services/api-gateway: apply
  • 23:46 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host logging-sd1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 23:44 jclark@cumin1003: START - Cookbook sre.hosts.provision for host logging-sd1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 23:44 jclark@cumin1003: START - Cookbook sre.hosts.provision for host logging-sd1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 23:44 jclark@cumin1003: START - Cookbook sre.hosts.provision for host logging-sd1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 23:41 rzl: rzl@deploy2002:/srv/deployment-charts/helmfile.d/services/mw-debug$ helmfile -e codfw -i apply -l name=pinkunicorn --context=5 # T410975
  • 23:40 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 23:40 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 23:36 rzl: rzl@deploy2002:/srv/deployment-charts/helmfile.d/services/mw-debug$ helmfile -e codfw -i apply -l name=pinkunicorn --set mesh.image_name=envoy-future --set mesh.image_version=1.35.7-1 --context=5 # T410975
  • 23:35 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 23:35 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 23:30 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mathoid: apply
  • 23:29 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mathoid: apply
  • 23:28 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mathoid: apply
  • 23:27 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/mathoid: apply
  • 23:08 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/mathoid: apply
  • 23:07 rzl@deploy2002: helmfile [staging] START helmfile.d/services/mathoid: apply
  • 23:04 cjming@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic: apply
  • 23:04 cjming@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic: apply
  • 22:58 sfaci@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic-next: apply
  • 22:57 sfaci@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply
  • 22:22 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti-jumbo2001.codfw.wmnet with OS trixie
  • 22:22 jhancock@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin1003"
  • 22:19 jhancock@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin1003"
  • 22:15 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti-jumbo2003.codfw.wmnet with OS trixie
  • 22:15 jhancock@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin1003"
  • 22:14 jhancock@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin1003"
  • 22:11 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti-jumbo2002.codfw.wmnet with OS trixie
  • 22:11 jhancock@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin1003"
  • 22:10 jhancock@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin1003"
  • 22:02 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti-jumbo2001.codfw.wmnet with reason: host reimage
  • 21:58 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti-jumbo2003.codfw.wmnet with reason: host reimage
  • 21:54 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti-jumbo2002.codfw.wmnet with reason: host reimage
  • 21:50 jhancock@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti-jumbo2003.codfw.wmnet with reason: host reimage
  • 21:49 jhancock@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti-jumbo2001.codfw.wmnet with reason: host reimage
  • 21:49 jhancock@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti-jumbo2002.codfw.wmnet with reason: host reimage
  • 21:49 kemayo@deploy2002: Finished scap sync-world: Backport for Add experiment + tracking for mobile section switching (T410803), mobileSectionSwitch: action_context needs to be stringified (T410803) (duration: 09m 40s)
  • 21:44 kemayo@deploy2002: kemayo: Continuing with sync
  • 21:42 kemayo@deploy2002: kemayo: Backport for Add experiment + tracking for mobile section switching (T410803), mobileSectionSwitch: action_context needs to be stringified (T410803) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 21:39 kemayo@deploy2002: Started scap sync-world: Backport for Add experiment + tracking for mobile section switching (T410803), mobileSectionSwitch: action_context needs to be stringified (T410803)
  • 21:38 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host ganeti-jumbo2003.codfw.wmnet with OS trixie
  • 21:38 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host ganeti-jumbo2002.codfw.wmnet with OS trixie
  • 21:38 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host ganeti-jumbo2001.codfw.wmnet with OS trixie
  • 21:36 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti-jumbo2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 21:35 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti-jumbo2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 21:33 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti-jumbo2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 21:32 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228 (T410589)', diff saved to https://phabricator.wikimedia.org/P86509 and previous config saved to /var/cache/conftool/dbconfig/20251210-213235-ladsgroup.json
  • 21:30 jsn@deploy2002: Finished scap sync-world: Backport for Enable revertrisk filters in thwiki (T409438) (duration: 08m 51s)
  • 21:25 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host ganeti-jumbo2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 21:25 jsn@deploy2002: kgraessle, jsn: Continuing with sync
  • 21:25 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host ganeti-jumbo2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 21:24 bking@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on wdqs1014.eqiad.wmnet with reason: catching up on lag
  • 21:23 jsn@deploy2002: kgraessle, jsn: Backport for Enable revertrisk filters in thwiki (T409438) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 21:22 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host ganeti-jumbo2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 21:21 jsn@deploy2002: Started scap sync-world: Backport for Enable revertrisk filters in thwiki (T409438)
  • 21:20 jhancock@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti-jumbo2003
  • 21:20 jhancock@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti-jumbo2002
  • 21:20 jhancock@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti-jumbo2001
  • 21:20 jhancock@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ganeti-jumbo2003
  • 21:20 jhancock@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ganeti-jumbo2002
  • 21:20 jhancock@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ganeti-jumbo2001
  • 21:20 jhancock@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 21:20 jhancock@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding ganeti-jumbo2001-3 to codfw - jhancock@cumin1003"
  • 21:20 jhancock@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding ganeti-jumbo2001-3 to codfw - jhancock@cumin1003"
  • 21:19 sbassett@deploy2002: Finished scap sync-world: Backport for Set CSP Report Only mode for group1 wikis (T291867) (duration: 10m 34s)
  • 21:17 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228', diff saved to https://phabricator.wikimedia.org/P86508 and previous config saved to /var/cache/conftool/dbconfig/20251210-211728-ladsgroup.json
  • 21:16 jhancock@cumin1003: START - Cookbook sre.dns.netbox
  • 21:12 sbassett@deploy2002: sbassett: Continuing with sync
  • 21:12 sbassett@deploy2002: sbassett: Backport for Set CSP Report Only mode for group1 wikis (T291867) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 21:08 sbassett@deploy2002: Started scap sync-world: Backport for Set CSP Report Only mode for group1 wikis (T291867)
  • 21:06 larssandergreen: Updating civicrm from 764fa3a8 to 41a460d5
  • 21:02 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228', diff saved to https://phabricator.wikimedia.org/P86507 and previous config saved to /var/cache/conftool/dbconfig/20251210-210220-ladsgroup.json
  • 21:01 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aqs1026.eqiad.wmnet with OS bullseye
  • 21:01 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
  • 21:01 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
  • 21:00 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aqs1024.eqiad.wmnet with OS bullseye
  • 21:00 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
  • 21:00 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aqs1027.eqiad.wmnet with OS bullseye
  • 21:00 jclark@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
  • 20:57 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
  • 20:52 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
  • 20:49 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aqs1023.eqiad.wmnet with OS bullseye
  • 20:49 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
  • 20:48 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
  • 20:47 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228 (T410589)', diff saved to https://phabricator.wikimedia.org/P86506 and previous config saved to /var/cache/conftool/dbconfig/20251210-204712-ladsgroup.json
  • 20:44 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aqs1026.eqiad.wmnet with reason: host reimage
  • 20:41 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aqs1027.eqiad.wmnet with reason: host reimage
  • 20:37 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aqs1024.eqiad.wmnet with reason: host reimage
  • 20:33 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aqs1026.eqiad.wmnet with reason: host reimage
  • 20:33 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aqs1027.eqiad.wmnet with reason: host reimage
  • 20:33 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aqs1023.eqiad.wmnet with reason: host reimage
  • 20:33 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aqs1024.eqiad.wmnet with reason: host reimage
  • 20:31 ryankemper: [WDQS] `ryankemper@wdqs1014:~$ sudo systemctl restart wdqs-blazegraph` to unstick deadlock
  • 20:30 urbanecm@deploy2002: Finished scap sync-world: test (duration: 76m 31s)
  • 20:29 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aqs1023.eqiad.wmnet with reason: host reimage
  • 20:23 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host aqs1027.eqiad.wmnet with OS bullseye
  • 20:23 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host aqs1026.eqiad.wmnet with OS bullseye
  • 20:22 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host aqs1024.eqiad.wmnet with OS bullseye
  • 20:18 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host aqs1023.eqiad.wmnet with OS bullseye
  • 20:18 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aqs1027.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 20:18 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aqs1024.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 20:17 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aqs1026.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 20:13 topranks: Remove 2x40G LAGs between ssw1-d1-eqiad ssw1-d8-eqiad and asw2-c-eqiad asw2-d-eqiad
  • 20:10 jclark@cumin1003: START - Cookbook sre.hosts.provision for host aqs1027.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 20:10 jclark@cumin1003: START - Cookbook sre.hosts.provision for host aqs1026.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 20:09 jclark@cumin1003: START - Cookbook sre.hosts.provision for host aqs1024.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 20:05 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs1018.eqiad.wmnet with OS bullseye
  • 19:53 eevans@deploy2002: helmfile [eqiad] DONE helmfile.d/services/data-gateway: apply
  • 19:52 eevans@deploy2002: helmfile [eqiad] START helmfile.d/services/data-gateway: apply
  • 19:52 eevans@deploy2002: helmfile [codfw] DONE helmfile.d/services/data-gateway: apply
  • 19:51 eevans@deploy2002: helmfile [codfw] START helmfile.d/services/data-gateway: apply
  • 19:48 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1018.eqiad.wmnet with reason: host reimage
  • 19:44 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs1018.eqiad.wmnet with reason: host reimage
  • 19:29 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs1018.eqiad.wmnet with OS bullseye
  • 19:14 urbanecm@deploy2002: Started scap sync-world: test
  • 19:08 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1018.eqiad.wmnet with reason: T411781
  • 19:06 brett: stop pybal/puppet on lvs1018 (T411781)
  • 19:03 topranks: disable BGP on cr1-eqiad and cr2-eqiad to lvs1018 to fail over to lvs1020 (T411781)
  • 19:03 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1018.eqiad.wmnet
  • 19:03 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs1018.eqiad.wmnet
  • 18:38 urbanecm@deploy2002: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.4,1.46.0-wmf.5,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/me
  • 18:27 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host logging-sd2005.codfw.wmnet with OS bookworm
  • 18:26 jhancock@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin1003"
  • 18:26 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host logging-sd2007.codfw.wmnet with OS bookworm
  • 18:26 jhancock@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin1003"
  • 18:06 jhancock@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin1003"
  • 17:48 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on logging-sd2005.codfw.wmnet with reason: host reimage
  • 17:46 urbanecm@deploy2002: Started scap sync-world: Backport for Confirmation email: further styling adjustments (T411526), i18n: replace <> to avoid false positive export errors
  • 17:44 jhancock@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on logging-sd2005.codfw.wmnet with reason: host reimage
  • 17:44 urbanecm@deploy2002: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.4,1.46.0-wmf.5,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/me
  • 17:35 jhancock@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin1003"
  • 17:19 jgleeson: civicrm upgraded from 5a21fb9c to 764fa3a8
  • 17:18 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host logging-sd2006.codfw.wmnet with OS bookworm
  • 17:18 jhancock@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin1003"
  • 17:17 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on logging-sd2007.codfw.wmnet with reason: host reimage
  • 17:16 jhancock@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin1003"
  • 17:13 jhancock@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on logging-sd2007.codfw.wmnet with reason: host reimage
  • 17:05 larssandergreen: Updating civicrm from bdf84821 to 5a21fb9c
  • 17:03 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-build1001.eqiad.wmnet with OS trixie
  • 17:03 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-build1001.eqiad.wmnet with OS trixie
  • 17:02 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host logging-sd2007.codfw.wmnet with OS bookworm
  • 17:02 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host logging-sd2005.codfw.wmnet with OS bookworm
  • 16:59 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on logging-sd2006.codfw.wmnet with reason: host reimage
  • 16:53 jhancock@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on logging-sd2006.codfw.wmnet with reason: host reimage
  • 16:49 urbanecm@deploy2002: Started scap sync-world: Backport for Confirmation email: further styling adjustments (T411526), i18n: replace <> to avoid false positive export errors
  • 16:48 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-build1001.eqiad.wmnet with reason: host reimage
  • 16:43 dpogorzelski@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-build1001.eqiad.wmnet with reason: host reimage
  • 16:42 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host logging-sd2006.codfw.wmnet with OS bookworm
  • 16:39 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host logging-sd2005.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 16:36 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host logging-sd2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 16:31 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host logging-sd2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 16:30 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host logging-sd2005.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 16:27 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host logging-sd2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 16:27 bking@dns1004: END - running authdns-update
  • 16:26 bking@dns1004: START - running authdns-update
  • 16:26 dpogorzelski@cumin1003: START - Cookbook sre.hosts.reimage for host ml-build1001.eqiad.wmnet with OS trixie
  • 16:11 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host logging-sd2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 16:11 jhancock@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host logging-sd2007
  • 16:10 jhancock@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host logging-sd2006
  • 16:10 jhancock@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host logging-sd2007
  • 16:10 jhancock@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host logging-sd2006
  • 15:34 dreamyjazz@deploy2002: Finished scap sync-world: Backport for Define config for v2 of suggested investigations instrument (T409260) (duration: 06m 47s)
  • 15:33 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aqs1023.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 15:30 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
  • 15:29 dreamyjazz@deploy2002: dreamyjazz: Backport for Define config for v2 of suggested investigations instrument (T409260) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 15:27 dreamyjazz@deploy2002: Started scap sync-world: Backport for Define config for v2 of suggested investigations instrument (T409260)
  • 15:25 gengh@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 15:25 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aqs1023.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 15:25 gengh@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 15:25 gengh@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 15:24 gengh@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 15:24 gengh@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 15:23 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host logging-sd1005
  • 15:23 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host logging-sd1005
  • 15:23 gengh@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 15:22 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host logging-sd1007
  • 15:22 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host logging-sd1007
  • 15:21 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host logging-sd1006
  • 15:21 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host logging-sd1006
  • 15:20 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host logging-sd1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 15:18 jclark@cumin1003: START - Cookbook sre.hosts.provision for host logging-sd1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 15:15 gengh@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 15:15 gengh@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 15:14 gengh@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 15:14 gengh@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 15:13 gengh@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 15:13 gengh@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 15:10 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aqs1027.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 15:10 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aqs1026.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 15:10 jclark@cumin1003: START - Cookbook sre.hosts.provision for host aqs1027.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 15:10 jclark@cumin1003: START - Cookbook sre.hosts.provision for host aqs1026.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 15:09 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aqs1024.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 15:09 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aqs1023.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 15:09 jclark@cumin1003: START - Cookbook sre.hosts.provision for host aqs1024.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 15:09 gengh@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 15:09 jclark@cumin1003: START - Cookbook sre.hosts.provision for host aqs1023.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 15:08 gengh@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 15:08 gengh@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 15:08 gengh@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 15:07 gengh@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 15:06 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aqs1027.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 15:06 gengh@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 15:06 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aqs1026.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 14:59 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aqs1025.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 14:57 jclark@cumin1003: START - Cookbook sre.hosts.provision for host aqs1027.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 14:56 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aqs1024.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 14:56 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aqs1023.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 14:56 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:56 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added network and mgmt aqs servers - jclark@cumin1003"
  • 14:56 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added network and mgmt aqs servers - jclark@cumin1003"
  • 14:56 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
  • 14:56 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
  • 14:52 jclark@cumin1003: START - Cookbook sre.dns.netbox
  • 14:52 jclark@cumin1003: START - Cookbook sre.hosts.provision for host aqs1026.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 14:51 jclark@cumin1003: START - Cookbook sre.hosts.provision for host aqs1025.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 14:51 jclark@cumin1003: START - Cookbook sre.hosts.provision for host aqs1024.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 14:51 jclark@cumin1003: START - Cookbook sre.hosts.provision for host aqs1023.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 14:49 Lucas_WMDE: UTC afternoon backport+config window done
  • 14:49 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Set wgEnableWatchlistLabels for beta (T411836) (duration: 07m 21s)
  • 14:45 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, samtar: Continuing with sync
  • 14:44 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, samtar: Backport for Set wgEnableWatchlistLabels for beta (T411836) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:41 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Set wgEnableWatchlistLabels for beta (T411836)
  • 14:26 arlolra@deploy2002: Finished scap sync-world: Backport for ExtensionDistributor: mark 1.45 as stable (T408482) (duration: 06m 29s)
  • 14:22 arlolra@deploy2002: arlolra, macfan4000: Continuing with sync
  • 14:22 arlolra@deploy2002: arlolra, macfan4000: Backport for ExtensionDistributor: mark 1.45 as stable (T408482) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:20 arlolra@deploy2002: Started scap sync-world: Backport for ExtensionDistributor: mark 1.45 as stable (T408482)
  • 14:14 sbisson@deploy2002: Finished scap sync-world: Backport for CX3 Build 1.0.0+20251209 (T384485 T408845 T409332 T409337 T409338 T411779) (duration: 09m 01s)
  • 14:10 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2228 (T410589)', diff saved to https://phabricator.wikimedia.org/P86501 and previous config saved to /var/cache/conftool/dbconfig/20251210-141046-ladsgroup.json
  • 14:10 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2228.codfw.wmnet with reason: Maintenance
  • 14:10 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223 (T410589)', diff saved to https://phabricator.wikimedia.org/P86500 and previous config saved to /var/cache/conftool/dbconfig/20251210-141022-ladsgroup.json
  • 14:08 sbisson@deploy2002: sbisson: Continuing with sync
  • 14:07 sbisson@deploy2002: sbisson: Backport for CX3 Build 1.0.0+20251209 (T384485 T408845 T409332 T409337 T409338 T411779) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:05 sbisson@deploy2002: Started scap sync-world: Backport for CX3 Build 1.0.0+20251209 (T384485 T408845 T409332 T409337 T409338 T411779)
  • 13:55 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to https://phabricator.wikimedia.org/P86499 and previous config saved to /var/cache/conftool/dbconfig/20251210-135514-ladsgroup.json
  • 13:53 kart_: Updated Recommendation API to 2025-12-09-164214-production (T384485, T409338, T409332)
  • 13:51 kartik@deploy2002: helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
  • 13:47 kartik@deploy2002: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
  • 13:41 kartik@deploy2002: helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
  • 13:40 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to https://phabricator.wikimedia.org/P86497 and previous config saved to /var/cache/conftool/dbconfig/20251210-134007-ladsgroup.json
  • 13:27 hnowlan@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on A:swift-fe-eqiad
  • 13:25 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223 (T410589)', diff saved to https://phabricator.wikimedia.org/P86496 and previous config saved to /var/cache/conftool/dbconfig/20251210-132459-ladsgroup.json
  • 13:20 hnowlan@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on A:swift-fe-eqiad
  • 12:53 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
  • 12:52 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
  • 11:50 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-analytics-test: apply
  • 11:50 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-analytics-test: apply
  • 11:41 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1017.eqiad.wmnet
  • 11:35 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1017.eqiad.wmnet
  • 10:39 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-build1001.eqiad.wmnet with reason: host reimage
  • 10:35 dpogorzelski@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-build1001.eqiad.wmnet with reason: host reimage
  • 10:19 dpogorzelski@cumin1003: START - Cookbook sre.hosts.reimage for host ml-build1001.eqiad.wmnet with OS trixie
  • 10:14 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from ml-lab1001 to ml-build1001
  • 10:13 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ml-build1001
  • 10:11 jelto@puppetserver1001: conftool action : set/pooled=no; selector: cluster=tcp-proxy,service=gerrit
  • 10:11 dpogorzelski@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ml-build1001
  • 10:11 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ml-build1001 on all recursors
  • 10:11 dpogorzelski@cumin1003: START - Cookbook sre.dns.wipe-cache ml-build1001 on all recursors
  • 10:11 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:11 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming ml-lab1001 to ml-build1001 - dpogorzelski@cumin1003"
  • 10:10 dpogorzelski@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming ml-lab1001 to ml-build1001 - dpogorzelski@cumin1003"
  • 10:04 dpogorzelski@cumin1003: START - Cookbook sre.dns.netbox
  • 10:04 dpogorzelski@cumin1003: START - Cookbook sre.hosts.rename from ml-lab1001 to ml-build1001
  • 10:01 jelto@puppetserver1001: conftool action : set/pooled=no; selector: cluster=tcp-proxy,service=gerrit,dc=drmrs
  • 09:50 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 09:50 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 09:47 jelto@puppetserver1001: conftool action : set/pooled=no; selector: name=tcp-proxy6001.drmrs.wmnet
  • 09:15 joal@deploy2002: Finished deploy [analytics/refinery@6e8f9d4] (thin): Regular analytics train THIN [analytics/refinery@6e8f9d4a] (duration: 01m 13s)
  • 09:14 joal@deploy2002: Started deploy [analytics/refinery@6e8f9d4] (thin): Regular analytics train THIN [analytics/refinery@6e8f9d4a]
  • 09:14 joal@deploy2002: Finished deploy [analytics/refinery@6e8f9d4]: Regular analytics train [analytics/refinery@6e8f9d4a] (duration: 02m 30s)
  • 09:11 joal@deploy2002: Started deploy [analytics/refinery@6e8f9d4]: Regular analytics train [analytics/refinery@6e8f9d4a]
  • 09:11 joal@deploy2002: Finished deploy [analytics/refinery@6e8f9d4] (hadoop-test): Regular analytics train TEST [analytics/refinery@6e8f9d4a] (duration: 01m 04s)
  • 09:10 joal@deploy2002: Started deploy [analytics/refinery@6e8f9d4] (hadoop-test): Regular analytics train TEST [analytics/refinery@6e8f9d4a]
  • 05:56 dpogorzelski@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ml-lab1001.eqiad.wmnet with OS trixie
  • 05:51 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2223 (T410589)', diff saved to https://phabricator.wikimedia.org/P86492 and previous config saved to /var/cache/conftool/dbconfig/20251210-055138-ladsgroup.json
  • 05:51 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2223.codfw.wmnet with reason: Maintenance
  • 05:51 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213 (T410589)', diff saved to https://phabricator.wikimedia.org/P86491 and previous config saved to /var/cache/conftool/dbconfig/20251210-055125-ladsgroup.json
  • 05:36 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213', diff saved to https://phabricator.wikimedia.org/P86490 and previous config saved to /var/cache/conftool/dbconfig/20251210-053618-ladsgroup.json
  • 05:21 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213', diff saved to https://phabricator.wikimedia.org/P86489 and previous config saved to /var/cache/conftool/dbconfig/20251210-052110-ladsgroup.json
  • 05:06 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213 (T410589)', diff saved to https://phabricator.wikimedia.org/P86488 and previous config saved to /var/cache/conftool/dbconfig/20251210-050603-ladsgroup.json
  • 01:57 cstone: SmashPig upgraded from 1442d0a0 to 5c731f99
  • 01:18 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 17m 50s)
  • 01:01 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image

2025-12-09

  • 23:28 jhancock@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host logging-sd2005.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 23:28 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host logging-sd2005.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 23:27 jhancock@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host logging-sd2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 23:27 jhancock@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host logging-sd2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 23:26 jhancock@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host logging-sd2005.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 23:26 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host logging-sd2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 23:25 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host logging-sd2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 23:25 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host logging-sd2005.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 23:24 jhancock@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host logging-sd2007
  • 23:24 jhancock@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host logging-sd2007
  • 23:24 jhancock@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host logging-sd2006
  • 23:24 jhancock@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host logging-sd2006
  • 23:24 jhancock@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host logging-sd2006
  • 23:24 jhancock@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host logging-sd2006
  • 23:24 jhancock@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host logging-sd2005
  • 23:24 jhancock@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host logging-sd2007
  • 23:23 jhancock@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host logging-sd2006
  • 23:23 jhancock@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host logging-sd2007
  • 23:23 jhancock@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host logging-sd2006
  • 23:23 jhancock@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host logging-sd2005
  • 23:23 jhancock@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 23:23 jhancock@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding logging-sd2005-7 to codfw - jhancock@cumin1003"
  • 23:23 jhancock@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding logging-sd2005-7 to codfw - jhancock@cumin1003"
  • 23:19 jhancock@cumin1003: START - Cookbook sre.dns.netbox
  • 22:28 jdrewniak@deploy2002: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 02m 02s)
  • 22:26 jdrewniak@deploy2002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 05m 59s)
  • 22:07 jhathaway@dns1004: END - running authdns-update
  • 22:06 jhathaway@dns1004: START - running authdns-update
  • 22:01 bking@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on wdqs[1028-1032].eqiad.wmnet with reason: T410406
  • 21:32 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2213 (T410589)', diff saved to https://phabricator.wikimedia.org/P86487 and previous config saved to /var/cache/conftool/dbconfig/20251209-213205-ladsgroup.json
  • 21:31 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2213.codfw.wmnet with reason: Maintenance
  • 21:31 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 (T410589)', diff saved to https://phabricator.wikimedia.org/P86486 and previous config saved to /var/cache/conftool/dbconfig/20251209-213152-ladsgroup.json
  • 21:26 catrope@deploy2002: Finished scap sync-world: Backport for [ukwiki] Limit thanks for newbies to 3 per hour (T411588), [enwikibooks] Allow sysops to revert abusefilter and grant/revoke some flags (T411828) (duration: 07m 56s)
  • 21:24 taavi: run new CentralAuth:RecalculateGlobalEditCount.php on tokwiki
  • 21:22 catrope@deploy2002: superpes, catrope: Continuing with sync
  • 21:21 catrope@deploy2002: superpes, catrope: Backport for [ukwiki] Limit thanks for newbies to 3 per hour (T411588), [enwikibooks] Allow sysops to revert abusefilter and grant/revoke some flags (T411828) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 21:18 catrope@deploy2002: Started scap sync-world: Backport for [ukwiki] Limit thanks for newbies to 3 per hour (T411588), [enwikibooks] Allow sysops to revert abusefilter and grant/revoke some flags (T411828)
  • 21:16 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P86485 and previous config saved to /var/cache/conftool/dbconfig/20251209-211644-ladsgroup.json
  • 21:13 egardner@deploy2002: Finished scap sync-world: Backport for Backport: Instrument sticky header session length to 1.46.0-wmf.5 (T412146), Fix scroll-on-collapse (T411868 T411869), Fix heading background positioning (T412054) (duration: 08m 26s)
  • 21:09 egardner@deploy2002: egardner, ksarabia: Continuing with sync
  • 21:08 egardner@deploy2002: egardner, ksarabia: Backport for Backport: Instrument sticky header session length to 1.46.0-wmf.5 (T412146), Fix scroll-on-collapse (T411868 T411869), Fix heading background positioning (T412054) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 21:05 egardner@deploy2002: Started scap sync-world: Backport for Backport: Instrument sticky header session length to 1.46.0-wmf.5 (T412146), Fix scroll-on-collapse (T411868 T411869), Fix heading background positioning (T412054)
  • 21:01 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P86484 and previous config saved to /var/cache/conftool/dbconfig/20251209-210136-ladsgroup.json
  • 20:46 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 (T410589)', diff saved to https://phabricator.wikimedia.org/P86483 and previous config saved to /var/cache/conftool/dbconfig/20251209-204628-ladsgroup.json
  • 20:11 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest2009.codfw.wmnet with OS trixie
  • 19:54 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2009.codfw.wmnet with reason: host reimage
  • 19:48 cmooney@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2009.codfw.wmnet with reason: host reimage
  • 19:20 cmooney@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2009.codfw.wmnet with OS trixie
  • 18:47 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ssw1-e1-codfw.mgmt,ssw1-f1-codfw.mgmt with reason: upgrading sr-linux on Nokia switches codfw
  • 18:32 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 17 hosts with reason: upgrading sr-linux on Nokia switches codfw
  • 18:25 dzahn@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 18:24 dzahn@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 18:24 dzahn@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 18:24 dzahn@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 18:23 dzahn@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 18:22 dzahn@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 18:22 dzahn@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 18:21 dzahn@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 17:19 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
  • 17:19 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
  • 16:39 brett@dns1006: END - running authdns-update
  • 16:38 brett@dns1006: START - running authdns-update
  • 16:20 brett: Remove varnishkafka from trixie-wikimedia - T401832
  • 15:47 cdanis@dns3003: END - running authdns-update
  • 15:45 cdanis@dns3003: START - running authdns-update
  • 15:22 Lucas_WMDE: UTC afternoon backport+config window done
  • 15:19 sbisson@deploy2002: Finished scap sync-world: Backport for Article search: surface nominated collections (JSON files) (T408842) (duration: 69m 26s)
  • 15:15 vgutierrez: restarting ATS on cp3074
  • 15:06 sbisson@deploy2002: sbisson: Continuing with sync
  • 15:05 sbisson@deploy2002: sbisson: Backport for Article search: surface nominated collections (JSON files) (T408842) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:28 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db1229 gradually with 4 steps - Pooling in after cloning
  • 14:09 sbisson@deploy2002: Started scap sync-world: Backport for Article search: surface nominated collections (JSON files) (T408842)
  • 14:08 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-lab1001.eqiad.wmnet with reason: host reimage
  • 14:04 bwojtowicz@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 14:02 dpogorzelski@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-lab1001.eqiad.wmnet with reason: host reimage
  • 13:53 gehel: sudo cumin 'A:lvs-low-traffic-eqiad' 'systemctl restart pybal.service' - T406222
  • 13:48 gehel: sudo cumin 'A:lvs-secondary-eqiad' 'systemctl restart pybal.service' - T406222
  • 13:48 dbrant@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifeeds: apply
  • 13:47 dbrant@deploy2002: helmfile [codfw] START helmfile.d/services/wikifeeds: apply
  • 13:47 dbrant@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifeeds: apply
  • 13:47 dpogorzelski@cumin1003: START - Cookbook sre.hosts.reimage for host ml-lab1001.eqiad.wmnet with OS trixie
  • 13:47 dbrant@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifeeds: apply
  • 13:45 dbrant@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
  • 13:45 dbrant@deploy2002: helmfile [staging] START helmfile.d/services/wikifeeds: apply
  • 13:43 fceratto@cumin1003: START - Cookbook sre.mysql.pool db1229 gradually with 4 steps - Pooling in after cloning
  • 13:19 dpogorzelski@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ml-lab1001.eqiad.wmnet with OS trixie
  • 13:06 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2211 (T410589)', diff saved to https://phabricator.wikimedia.org/P86471 and previous config saved to /var/cache/conftool/dbconfig/20251209-130640-ladsgroup.json
  • 13:06 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2211.codfw.wmnet with reason: Maintenance
  • 13:04 dpogorzelski@cumin1003: START - Cookbook sre.hosts.reimage for host ml-lab1001.eqiad.wmnet with OS trixie
  • 13:03 dpogorzelski@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ml-lab1001.eqiad.wmnet with OS trixie
  • 12:30 dpogorzelski@cumin1003: START - Cookbook sre.hosts.reimage for host ml-lab1001.eqiad.wmnet with OS trixie
  • 12:09 dpogorzelski@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ml-lab1001.eqiad.wmnet with OS trixie
  • 10:58 dpogorzelski@cumin1003: START - Cookbook sre.hosts.reimage for host ml-lab1001.eqiad.wmnet with OS trixie
  • 10:57 dpogorzelski@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ml-lab1001.eqiad.wmnet with OS trixie
  • 10:38 XioNoX: set port-speed on disabled Nokia interface
  • 10:30 dpogorzelski@cumin1003: START - Cookbook sre.hosts.reimage for host ml-lab1001.eqiad.wmnet with OS trixie
  • 10:03 dpogorzelski@cumin1003: END (FAIL) - Cookbook sre.hosts.rename (exit_code=93) from ml-lab1001 to ml-build1001
  • 10:03 dpogorzelski@cumin1003: START - Cookbook sre.hosts.rename from ml-lab1001 to ml-build1001
  • 09:53 elukey@puppetserver1001: conftool action : set/pooled=yes:weight=10; selector: name=ml-serve1013.eqiad.wmnet
  • 09:47 elukey@puppetserver1001: conftool action : set/pooled=true:weight=10; selector: name=ml-serve1013.eqiad.wmnet
  • 08:46 matthiasmullie: UTC morning backports done
  • 08:42 mlitn@deploy2002: Finished scap sync-world: Backport for Squashed diff to master (duration: 07m 34s)
  • 08:38 mlitn@deploy2002: mlitn: Continuing with sync
  • 08:36 mlitn@deploy2002: mlitn: Backport for Squashed diff to master synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 08:34 mlitn@deploy2002: Started scap sync-world: Backport for Squashed diff to master
  • 08:30 wmde-fisch@deploy2002: Finished scap sync-world: Backport for ext.wikimediaEvents: Add xLab impactTest experiment-specific instrument (T407570), VE: Don't create a synth ref when there's a LDR main ref (T411245) (duration: 08m 56s)
  • 08:26 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
  • 08:26 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
  • 08:25 wmde-fisch@deploy2002: wmde-fisch, sfaci: Continuing with sync
  • 08:23 wmde-fisch@deploy2002: wmde-fisch, sfaci: Backport for ext.wikimediaEvents: Add xLab impactTest experiment-specific instrument (T407570), VE: Don't create a synth ref when there's a LDR main ref (T411245) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 08:21 wmde-fisch@deploy2002: Started scap sync-world: Backport for ext.wikimediaEvents: Add xLab impactTest experiment-specific instrument (T407570), VE: Don't create a synth ref when there's a LDR main ref (T411245)
  • 05:48 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2201.codfw.wmnet with reason: Maintenance
  • 05:48 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T410589)', diff saved to https://phabricator.wikimedia.org/P86465 and previous config saved to /var/cache/conftool/dbconfig/20251209-054822-ladsgroup.json
  • 05:33 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P86464 and previous config saved to /var/cache/conftool/dbconfig/20251209-053314-ladsgroup.json
  • 05:18 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P86463 and previous config saved to /var/cache/conftool/dbconfig/20251209-051806-ladsgroup.json
  • 05:04 eileen: civicrm upgraded from e0867392 to bdf84821
  • 05:03 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T410589)', diff saved to https://phabricator.wikimedia.org/P86462 and previous config saved to /var/cache/conftool/dbconfig/20251209-050258-ladsgroup.json
  • 05:02 mwpresync@deploy2002: Pruned MediaWiki: 1.46.0-wmf.3 (duration: 02m 44s)
  • 04:06 eileen: civicrm upgraded from f66aaff7 to e0867392
  • 02:19 eileen: civicrm upgraded from 86784b37 to f66aaff7
  • 01:18 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 17m 40s)
  • 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
  • 00:21 eileen: civicrm upgraded from 2dfecb38 to 86784b37

2025-12-08

2025-12-07

  • 21:49 eileen: civicrm upgraded from 9cc43ebd to 9ba062e3
  • 18:36 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 17:20 elukey@deploy2002: helmfile [codfw] DONE helmfile.d/services/thumbor: sync
  • 17:20 elukey@deploy2002: helmfile [codfw] START helmfile.d/services/thumbor: sync
  • 11:51 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/thumbor: sync
  • 11:51 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/thumbor: sync
  • 02:51 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1240.eqiad.wmnet with reason: Maintenance
  • 02:51 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T410589)', diff saved to https://phabricator.wikimedia.org/P86442 and previous config saved to /var/cache/conftool/dbconfig/20251207-025120-ladsgroup.json
  • 02:36 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P86441 and previous config saved to /var/cache/conftool/dbconfig/20251207-023613-ladsgroup.json
  • 02:21 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P86440 and previous config saved to /var/cache/conftool/dbconfig/20251207-022105-ladsgroup.json
  • 02:06 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T410589)', diff saved to https://phabricator.wikimedia.org/P86439 and previous config saved to /var/cache/conftool/dbconfig/20251207-020558-ladsgroup.json
  • 01:18 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 17m 48s)
  • 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image

2025-12-06

  • 14:47 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1212 (T410589)', diff saved to https://phabricator.wikimedia.org/P86436 and previous config saved to /var/cache/conftool/dbconfig/20251206-144719-ladsgroup.json
  • 14:47 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1212.eqiad.wmnet with reason: Maintenance
  • 03:47 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1198.eqiad.wmnet with reason: Maintenance
  • 03:47 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T410589)', diff saved to https://phabricator.wikimedia.org/P86435 and previous config saved to /var/cache/conftool/dbconfig/20251206-034700-ladsgroup.json
  • 03:31 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P86434 and previous config saved to /var/cache/conftool/dbconfig/20251206-033152-ladsgroup.json
  • 03:16 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P86433 and previous config saved to /var/cache/conftool/dbconfig/20251206-031644-ladsgroup.json
  • 03:01 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T410589)', diff saved to https://phabricator.wikimedia.org/P86432 and previous config saved to /var/cache/conftool/dbconfig/20251206-030136-ladsgroup.json
  • 01:18 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 17m 22s)
  • 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image

2025-12-05

  • 22:35 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
  • 22:34 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
  • 22:34 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
  • 22:33 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
  • 22:32 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
  • 22:31 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
  • 22:29 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
  • 22:29 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
  • 22:11 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
  • 22:11 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
  • 22:11 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
  • 22:10 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
  • 21:57 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
  • 21:56 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
  • 21:49 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
  • 21:38 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
  • 21:19 xcollazo@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
  • 21:18 xcollazo@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
  • 21:03 tchin@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-analytics-external: apply
  • 21:03 tchin@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-analytics-external: apply
  • 20:17 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
  • 20:16 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
  • 20:06 ejegg: donorwiki upgraded from 9ab44e85 to bbd96c00
  • 19:50 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
  • 19:49 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
  • 19:10 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
  • 19:09 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
  • 18:28 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
  • 18:27 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
  • 18:18 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
  • 18:17 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
  • 18:10 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
  • 18:10 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
  • 17:23 topranks: add updated ssh firewall filter config to pfw1-eqiad.wikimedia.org T390939
  • 17:11 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-lab1001.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
  • 17:10 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-lab1001.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
  • 17:10 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-lab1001.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
  • 17:10 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-lab1001.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
  • 17:07 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.provision (exit_code=97) for host ml-lab1001.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
  • 17:02 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-lab1001.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
  • 17:02 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-lab1001.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
  • 16:52 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-lab1001.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
  • 16:03 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
  • 16:03 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
  • 15:30 Amir1: creating ores tables on thwiki (T409438)
  • 15:07 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1189 (T410589)', diff saved to https://phabricator.wikimedia.org/P86429 and previous config saved to /var/cache/conftool/dbconfig/20251205-150737-ladsgroup.json
  • 15:07 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1189.eqiad.wmnet with reason: Maintenance
  • 15:07 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T410589)', diff saved to https://phabricator.wikimedia.org/P86428 and previous config saved to /var/cache/conftool/dbconfig/20251205-150713-ladsgroup.json
  • 14:56 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
  • 14:55 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
  • 14:52 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
  • 14:52 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P86427 and previous config saved to /var/cache/conftool/dbconfig/20251205-145206-ladsgroup.json
  • 14:51 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
  • 14:51 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
  • 14:50 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
  • 14:46 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
  • 14:45 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
  • 14:36 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P86426 and previous config saved to /var/cache/conftool/dbconfig/20251205-143658-ladsgroup.json
  • 14:27 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
  • 14:26 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
  • 14:25 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/ferretdb-growthbook: apply
  • 14:25 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/ferretdb-growthbook: apply
  • 14:23 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-growthbook: apply
  • 14:22 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-growthbook: apply
  • 14:21 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T410589)', diff saved to https://phabricator.wikimedia.org/P86425 and previous config saved to /var/cache/conftool/dbconfig/20251205-142150-ladsgroup.json
  • 14:12 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
  • 14:11 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
  • 14:08 jayme: stopped puppet on wikikube-ctrl2* and restarted kube-apiserver to temporarily extend audit logging
  • 13:52 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
  • 13:52 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
  • 13:51 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
  • 13:50 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
  • 13:49 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/ferretdb-growthbook-next: apply
  • 13:49 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/ferretdb-growthbook-next: apply
  • 13:47 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-growthbook-next: apply
  • 13:46 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-growthbook-next: apply
  • 13:46 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-growthbook-next: apply
  • 13:46 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-growthbook-next: apply
  • 13:43 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/ferretdb-growthbook: apply
  • 13:43 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/ferretdb-growthbook: apply
  • 13:41 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-growthbook: apply
  • 13:33 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
  • 13:30 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
  • 13:10 moritzm: upload python3-sshpubkeys 3.3.1-1~wmf12u1 to apt.wikimedia.org T411816
  • 12:42 moritzm: upgrade python3-sshpubkeys on idm-test1001 to 3.3.1-1~wmf12u1 T411816
  • 12:30 jayme: removed helm release mw-script/utk6lsuw in k8s@codfw, which had been stuck in pending-install state for 9+ days
  • 11:53 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
  • 11:52 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
  • 11:42 lucaswerkmeister-wmde@deploy2002: kubectl delete job wikidata-resubmit-changes-for-dispatch-29415459 # T411862
  • 11:42 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db1233.eqiad.wmnet onto db1229.eqiad.wmnet
  • 11:26 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db1233 gradually with 4 steps - Pool db1233.eqiad.wmnet in after cloning
  • 10:41 fceratto@cumin1003: START - Cookbook sre.mysql.pool db1233 gradually with 4 steps - Pool db1233.eqiad.wmnet in after cloning
  • 10:02 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
  • 10:02 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
  • 09:59 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
  • 09:58 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
  • 09:16 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) db1233 - Depool db1233.eqiad.wmnet to then clone it to db1229.eqiad.wmnet - fceratto@cumin1003
  • 09:16 fceratto@cumin1003: START - Cookbook sre.mysql.depool db1233 - Depool db1233.eqiad.wmnet to then clone it to db1229.eqiad.wmnet - fceratto@cumin1003
  • 09:16 fceratto@cumin1003: START - Cookbook sre.mysql.clone of db1233.eqiad.wmnet onto db1229.eqiad.wmnet
  • 08:04 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
  • 08:03 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
  • 08:02 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/ferretdb-growthbook-next: apply
  • 08:02 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/ferretdb-growthbook-next: apply
  • 08:02 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-growthbook-next: apply
  • 08:02 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-growthbook-next: apply
  • 07:28 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-growthbook-next: apply
  • 07:27 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-growthbook-next: apply
  • 03:24 larssandergreen: Updating civicrm from 7a979750 to 9cc43ebd
  • 03:08 larssandergreen: Updating civicrm from 36b09796 to 7a979750
  • 02:57 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1175 (T410589)', diff saved to https://phabricator.wikimedia.org/P86417 and previous config saved to /var/cache/conftool/dbconfig/20251205-025711-ladsgroup.json
  • 02:57 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1175.eqiad.wmnet with reason: Maintenance
  • 02:56 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T410589)', diff saved to https://phabricator.wikimedia.org/P86416 and previous config saved to /var/cache/conftool/dbconfig/20251205-025647-ladsgroup.json
  • 02:41 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P86415 and previous config saved to /var/cache/conftool/dbconfig/20251205-024139-ladsgroup.json
  • 02:40 ejegg: payments-wiki upgraded from 9ab44e85 to 5c381b45
  • 02:26 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P86414 and previous config saved to /var/cache/conftool/dbconfig/20251205-022631-ladsgroup.json
  • 02:11 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T410589)', diff saved to https://phabricator.wikimedia.org/P86413 and previous config saved to /var/cache/conftool/dbconfig/20251205-021123-ladsgroup.json
  • 02:09 wfan: donorwiki upgraded from 053b3f88 to 9ab44e85
  • 02:07 wfan: payments-wiki upgraded from d2799b95 to 9ab44e85
  • 02:01 rzl: rzl@apt1002:~$ sudo -i reprepro -C component/envoy-future include bullseye-wikimedia /home/rzl/envoyproxy_1.35.7-1_amd64.changes
  • 01:44 wfan: SmashPig upgraded from a25fbb28 to 1442d0a0
  • 01:41 eileen: civicrm upgraded from d4bd9b1b to 36b09796
  • 01:13 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 13m 06s)
  • 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
  • 00:27 Amir1: ladsgroup@deploy2002:~$ mwscript-k8s --follow -- findBadBlobs.php --wiki huwikiquote --mark "Corrupted UTF-8 (T351953)" --revisions 3804,3808,3811,3813,3814,3818,3825
  • 00:26 Amir1: ladsgroup@deploy2002:~$ mwscript-k8s --follow -- findBadBlobs.php --wiki guwiktionary --mark "Corrupted UTF-8 (T351953)" --revisions 20576

2025-12-04

  • 23:47 tzatziki: removing 4 files for legal compliance
  • 23:34 tzatziki: removing 2 files for legal compliance
  • 23:23 tzatziki: removing 3 files for legal compliance
  • 23:16 ryankemper@cumin2002: END (PASS) - Cookbook sre.hadoop.reboot-workers (exit_code=0) for Hadoop test cluster
  • 23:16 tzatziki: removing 5 files for legal compliance
  • 23:04 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy7002.magru.wmnet
  • 23:02 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy7001.magru.wmnet
  • 23:00 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host tcp-proxy7002.magru.wmnet
  • 23:00 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy6002.drmrs.wmnet
  • 22:59 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy6001.drmrs.wmnet
  • 22:59 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy5002.eqsin.wmnet
  • 22:58 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host tcp-proxy7001.magru.wmnet
  • 22:56 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy5001.eqsin.wmnet
  • 22:56 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host tcp-proxy6002.drmrs.wmnet
  • 22:55 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host tcp-proxy6001.drmrs.wmnet
  • 22:55 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host tcp-proxy5002.eqsin.wmnet
  • 22:55 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy3002.esams.wmnet
  • 22:55 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy4002.ulsfo.wmnet
  • 22:52 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy3001.esams.wmnet
  • 22:52 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host tcp-proxy5001.eqsin.wmnet
  • 22:51 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy2002.codfw.wmnet
  • 22:51 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host tcp-proxy4002.ulsfo.wmnet
  • 22:51 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host tcp-proxy3002.esams.wmnet
  • 22:51 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy4001.ulsfo.wmnet
  • 22:51 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy2001.codfw.wmnet
  • 22:50 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host tcp-proxy4001.ulsfo.wmnet
  • 22:50 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy1002.eqiad.wmnet
  • 22:49 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host tcp-proxy3001.esams.wmnet
  • 22:48 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host tcp-proxy2002.codfw.wmnet
  • 22:47 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host tcp-proxy2001.codfw.wmnet
  • 22:46 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host tcp-proxy1002.eqiad.wmnet
  • 22:42 dzahn@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-cluster (exit_code=99)
  • 22:42 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-cluster
  • 22:37 sbassett: Deployed security fix for T409226
  • 22:35 ryankemper@cumin2002: START - Cookbook sre.hadoop.reboot-workers for Hadoop test cluster
  • 22:28 sbassett: Deployed security fix for T408135
  • 22:22 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 14 hosts with reason: T408532
  • 22:20 ryankemper: T411568 Rebooting `stat*`
  • 22:11 ryankemper@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on stat[1008-1011].eqiad.wmnet with reason: T411568
  • 22:06 cscott@deploy2002: Finished scap sync-world: Backport for Activate postprocessing cache on testwiki, test2wiki, officewiki (T348255) (duration: 14m 23s)
  • 22:02 cscott@deploy2002: ihurbain, cscott: Continuing with sync
  • 21:54 cscott@deploy2002: ihurbain, cscott: Backport for Activate postprocessing cache on testwiki, test2wiki, officewiki (T348255) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 21:52 cscott@deploy2002: Started scap sync-world: Backport for Activate postprocessing cache on testwiki, test2wiki, officewiki (T348255)
  • 21:45 jforrester@deploy2002: Finished scap sync-world: Backport for Followup Ie40b9e59a4: Fortify unified metrics method (T411793) (duration: 07m 16s)
  • 21:40 jforrester@deploy2002: jforrester: Continuing with sync
  • 21:40 jforrester@deploy2002: jforrester: Backport for Followup Ie40b9e59a4: Fortify unified metrics method (T411793) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 21:37 jforrester@deploy2002: Started scap sync-world: Backport for Followup Ie40b9e59a4: Fortify unified metrics method (T411793)
  • 21:24 jforrester@deploy2002: Finished scap sync-world: Backport for [tokwiki] Allow sysops to grant/remove confirmed status (T411683), OATHAuth: Remove wmgOATHAuthDisableRight (T399664), Remove /data-parsoid/ endpoint from specs per T393557 (T411517), Shorten 'close' cookie wait period for enwiki banners (T411800) (duration: 10m 04s)
  • 21:19 jforrester@deploy2002: mstyles, aaron, superpes, jforrester, ejegg: Continuing with sync
  • 21:18 jforrester@deploy2002: mstyles, aaron, superpes, jforrester, ejegg: Backport for [tokwiki] Allow sysops to grant/remove confirmed status (T411683), OATHAuth: Remove wmgOATHAuthDisableRight (T399664), Remove /data-parsoid/ endpoint from specs per T393557 (T411517), Shorten 'close' cookie wait period for enwiki banners (T411800) synced to the t
  • 21:14 jforrester@deploy2002: Started scap sync-world: Backport for [tokwiki] Allow sysops to grant/remove confirmed status (T411683), OATHAuth: Remove wmgOATHAuthDisableRight (T399664), Remove /data-parsoid/ endpoint from specs per T393557 (T411517), Shorten 'close' cookie wait period for enwiki banners (T411800)
  • 21:11 kharlan@deploy2002: Finished scap sync-world: Backport for Use a separate right for Special:SuggestedInvestigations (T411557) (duration: 57m 45s)
  • 21:03 brett: import varnishkafka 1.2.0~deb13+wmf1 into trixie-wikimedia - T401832
  • 21:01 taavi@deploy2002: mwscript-k8s job started: initEditCount --wiki=tokwiki
  • 20:58 kharlan@deploy2002: kharlan: Continuing with sync
  • 20:57 kharlan@deploy2002: kharlan: Backport for Use a separate right for Special:SuggestedInvestigations (T411557) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 20:50 brett: import libvmod-wmfuniq 0.2.0~deb13+wmf1 into trixie-wikimedia - T401832
  • 20:28 brett: Delete libvmod-netmapper 1.10-1~deb13+wmf1, import libvmod-netmapper 1.10~deb13+wmf1 into trixie-wikimedia - T401832
  • 20:13 kharlan@deploy2002: Started scap sync-world: Backport for Use a separate right for Special:SuggestedInvestigations (T411557)
  • 20:13 brett: import libvmod-querysort 0.4~deb13+wmf1 into trixie-wikimedia - T401832
  • 20:05 cstone: payments-wiki upgraded from 714ed4cf to d2799b95
  • 20:00 brett: import libvmod-netmapper 1.10-1~deb13+wmf1 into trixie-wikimedia - T401832
  • 19:30 kharlan@deploy2002: Finished scap sync-world: Backport for hCaptcha: Persist the captcha consequence in the user session (T410657) (duration: 11m 16s)
  • 19:24 kharlan@deploy2002: kharlan: Continuing with sync
  • 19:21 kharlan@deploy2002: kharlan: Backport for hCaptcha: Persist the captcha consequence in the user session (T410657) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 19:19 kharlan@deploy2002: Started scap sync-world: Backport for hCaptcha: Persist the captcha consequence in the user session (T410657)
  • 19:13 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: apply
  • 19:12 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: apply
  • 18:50 cjming@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic: apply
  • 18:50 cjming@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic: apply
  • 18:46 cjming@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic-next: apply
  • 18:45 cjming@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply
  • 18:22 ejegg: fundraising civicrm rolled back from 510ab862 to d4bd9b1b
  • 18:21 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1019.eqiad.wmnet
  • 18:21 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs1019.eqiad.wmnet
  • 18:09 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
  • 18:09 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
  • 18:05 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs1019.eqiad.wmnet with OS bullseye
  • 17:48 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1019.eqiad.wmnet with reason: host reimage
  • 17:45 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs1019.eqiad.wmnet with reason: host reimage
  • 17:44 ejegg: fundraising civicrm upgraded from d4bd9b1b to 510ab862
  • 17:30 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs1019.eqiad.wmnet with OS bullseye
  • 17:21 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host franio1004
  • 17:21 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host franio1004
  • 17:20 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:17 vriley@cumin1003: START - Cookbook sre.dns.netbox
  • 17:06 topranks: disable BGP to lvs1019 on eqiad core routers ahead of switch migration T405628
  • 17:06 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1019.eqiad.wmnet with reason: move primary uplink from asw2-c7-eqiad to lsw1-c7-eqiad and remove link to asw2-d2-eqiad - T405628
  • 15:55 hashar@deploy2002: Finished deploy [gerrit/gerrit@121bd1c]: Remove duplicate [DISMISS] button (duration: 00m 11s)
  • 15:55 hashar@deploy2002: Started deploy [gerrit/gerrit@121bd1c]: Remove duplicate [DISMISS] button
  • 15:51 dpogorzelski@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on ml-lab1001.eqiad.wmnet with reason: decommission
  • 15:50 dpogorzelski@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on ml-lab1001.eqiad.wmnet with reason: decommission
  • 15:50 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host conf2005.codfw.wmnet
  • 15:45 bking@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker2003.codfw.wmnet
  • 15:45 bking@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker2003.codfw.wmnet
  • 15:44 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host conf2005.codfw.wmnet
  • 15:43 hashar@deploy2002: Finished deploy [gerrit/gerrit@774e2ff]: Ease configuration of the motd banner && Add banner for the 2025 developer survey (duration: 00m 15s)
  • 15:43 hashar@deploy2002: Started deploy [gerrit/gerrit@774e2ff]: Ease configuration of the motd banner && Add banner for the 2025 developer survey
  • 15:41 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host conf2004.codfw.wmnet
  • 15:38 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
  • 15:38 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
  • 15:36 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host conf2004.codfw.wmnet
  • 15:35 bking@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker2003.codfw.wmnet
  • 15:33 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host conf1009.eqiad.wmnet
  • 15:30 bking@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker2003.codfw.wmnet
  • 15:28 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host conf1009.eqiad.wmnet
  • 15:26 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host conf1008.eqiad.wmnet
  • 15:20 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host conf1008.eqiad.wmnet
  • 15:15 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host conf1007.eqiad.wmnet
  • 15:09 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host conf1007.eqiad.wmnet
  • 15:08 cgoubert@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 15:06 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
  • 15:06 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
  • 15:06 cgoubert@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 15:05 cgoubert@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 15:03 Lucas_WMDE: UTC afternoon backport+config window done
  • 15:03 cgoubert@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 15:03 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 15:02 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 15:02 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 15:02 ladsgroup@deploy2002: Finished scap sync-world: Backport for RevisionStore: Catch ParameterAssertionException too (T351953) (duration: 09m 26s)
  • 15:01 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 14:59 cgoubert@deploy2002: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
  • 14:59 cgoubert@deploy2002: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
  • 14:59 cgoubert@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 14:58 cgoubert@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 14:55 ladsgroup@deploy2002: jforrester, ladsgroup: Continuing with sync
  • 14:54 ladsgroup@deploy2002: jforrester, ladsgroup: Backport for RevisionStore: Catch ParameterAssertionException too (T351953) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:52 ladsgroup@deploy2002: Started scap sync-world: Backport for RevisionStore: Catch ParameterAssertionException too (T351953)
  • 14:50 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
  • 14:49 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-cron: apply
  • 14:37 derick@deploy2002: Finished scap sync-world: Backport for Revert "User: Log where the data was loaded when CAS update failed" (T410652), Revert "User: Log where the data was loaded when CAS update failed" (T410652), Fetch user object from primary DB (for writes) not replica DB (T410652) (duration: 13m 24s)
  • 14:27 derick@deploy2002: d3r1ck01, derick: Continuing with sync
  • 14:26 derick@deploy2002: d3r1ck01, derick: Backport for Revert "User: Log where the data was loaded when CAS update failed" (T410652), Revert "User: Log where the data was loaded when CAS update failed" (T410652), Fetch user object from primary DB (for writes) not replica DB (T410652) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes
  • 14:23 derick@deploy2002: Started scap sync-world: Backport for Revert "User: Log where the data was loaded when CAS update failed" (T410652), Revert "User: Log where the data was loaded when CAS update failed" (T410652), Fetch user object from primary DB (for writes) not replica DB (T410652)
  • 14:17 gehel@cumin2002: conftool action : set/weight=10; selector: service=druid-public-coordinator
  • 14:17 gehel@cumin2002: conftool action : set/pooled=yes; selector: service=druid-public-coordinator
  • 14:14 tchanders@deploy2002: Finished scap sync-world: Backport for Enable temporary accounts on enwikinews and ptwikibooks (T411618) (duration: 10m 36s)
  • 14:11 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1166 (T410589)', diff saved to https://phabricator.wikimedia.org/P86406 and previous config saved to /var/cache/conftool/dbconfig/20251204-141124-ladsgroup.json
  • 14:11 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 14:11 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T410589)', diff saved to https://phabricator.wikimedia.org/P86405 and previous config saved to /var/cache/conftool/dbconfig/20251204-141101-ladsgroup.json
  • 14:08 tchanders@deploy2002: tchanders: Continuing with sync
  • 14:06 tchanders@deploy2002: tchanders: Backport for Enable temporary accounts on enwikinews and ptwikibooks (T411618) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:03 tchanders@deploy2002: Started scap sync-world: Backport for Enable temporary accounts on enwikinews and ptwikibooks (T411618)
  • 13:55 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P86404 and previous config saved to /var/cache/conftool/dbconfig/20251204-135554-ladsgroup.json
  • 13:40 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P86403 and previous config saved to /var/cache/conftool/dbconfig/20251204-134046-ladsgroup.json
  • 13:25 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T410589)', diff saved to https://phabricator.wikimedia.org/P86402 and previous config saved to /var/cache/conftool/dbconfig/20251204-132539-ladsgroup.json
  • 13:22 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
  • 13:22 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
  • 13:19 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
  • 13:19 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
  • 13:16 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
  • 13:15 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
  • 13:15 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
  • 13:14 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
  • 13:07 moritzm: installing waitress security updates
  • 12:45 moritzm: installing postgresql-15 security updates
  • 11:31 moritzm: installing net-snmp security updates
  • 11:21 moritzm: rebuild software RAIDs on T410743
  • 11:00 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
  • 10:59 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
  • 09:48 moritzm: upgrade Envoy on an-launcher T405808
  • 09:43 hashar@deploy2002: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.5 refs T408275
  • 09:35 moritzm: cleanup lingering sessions of offboarded user T389324
  • 09:30 hashar@deploy2002: Finished scap sync-world: Backport for REST: add explicit cast to sitemapSize calcuation to avoid warning (T411580), Followup I81a2c4de77: Verify stats label values are not empty (T411585) (duration: 09m 59s)
  • 09:26 hashar@deploy2002: jforrester, hashar: Continuing with sync
  • 09:23 hashar@deploy2002: jforrester, hashar: Backport for REST: add explicit cast to sitemapSize calcuation to avoid warning (T411580), Followup I81a2c4de77: Verify stats label values are not empty (T411585) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 09:22 arnoldokoth: upgrade envoyproxy on lists T405808
  • 09:20 hashar@deploy2002: Started scap sync-world: Backport for REST: add explicit cast to sitemapSize calcuation to avoid warning (T411580), Followup I81a2c4de77: Verify stats label values are not empty (T411585)
  • 09:20 arnoldokoth: upgrade envoyproxy on vrts T405808
  • 09:19 slyngshede@cumin1003: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Arinaigum out of all services on: 2419 hosts
  • 03:50 ejegg: fundraising civicrm upgraded from b1fc5afc to d4bd9b1b
  • 01:23 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1157 (T410589)', diff saved to https://phabricator.wikimedia.org/P86394 and previous config saved to /var/cache/conftool/dbconfig/20251204-012321-ladsgroup.json
  • 01:23 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1157.eqiad.wmnet with reason: Maintenance
  • 01:18 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 17m 47s)
  • 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image

2025-12-03

  • 23:08 Amir1: hard rebooting codesearch9.codesearch.eqiad1.wikimedia.cloud (T411728)
  • 22:51 mutante: maintenance on https://codesearch.wmcloud.org/ - trying to fix disk space issue - detaching volume to extend it
  • 22:50 mutante: maintenance on https://codesearch.wmcloud.org/ - trying to fix disk space issue
  • 22:33 ryankemper@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
  • 22:33 ryankemper@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
  • 22:14 ryankemper@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
  • 22:13 ryankemper@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
  • 22:09 ryankemper@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
  • 22:08 ryankemper@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
  • 21:53 aaron@deploy2002: Finished scap sync-world: Backport for Update Math API title and project-specific /math/ endpoint stability policy (T411517) (duration: 08m 25s)
  • 21:49 aaron@deploy2002: aaron: Continuing with sync
  • 21:47 aaron@deploy2002: aaron: Backport for Update Math API title and project-specific /math/ endpoint stability policy (T411517) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 21:45 aaron@deploy2002: Started scap sync-world: Backport for Update Math API title and project-specific /math/ endpoint stability policy (T411517)
  • 21:42 derick@deploy2002: Finished scap sync-world: Backport for User: Log where the data was loaded when CAS update failed (T410652), User: Log where the data was loaded when CAS update failed (T410652) (duration: 07m 33s)
  • 21:38 derick@deploy2002: derick, d3r1ck01: Continuing with sync
  • 21:37 derick@deploy2002: derick, d3r1ck01: Backport for User: Log where the data was loaded when CAS update failed (T410652), User: Log where the data was loaded when CAS update failed (T410652) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 21:35 derick@deploy2002: Started scap sync-world: Backport for User: Log where the data was loaded when CAS update failed (T410652), User: Log where the data was loaded when CAS update failed (T410652)
  • 21:28 dani@deploy2002: Finished scap sync-world: Backport for Increase coverage of 2025 Global Readers Survey (non-enwiki) (T410918), OATHAuth: Expand 2FA to all users (T399664) (duration: 11m 18s)
  • 21:24 dani@deploy2002: dani, mstyles: Continuing with sync
  • 21:19 dani@deploy2002: dani, mstyles: Backport for Increase coverage of 2025 Global Readers Survey (non-enwiki) (T410918), OATHAuth: Expand 2FA to all users (T399664) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 21:17 dani@deploy2002: Started scap sync-world: Backport for Increase coverage of 2025 Global Readers Survey (non-enwiki) (T410918), OATHAuth: Expand 2FA to all users (T399664)
  • 21:14 aude@deploy2002: Finished scap sync-world: Backport for [Legal Footer] Create config for adding legal footer (T410163) (duration: 08m 38s)
  • 21:10 aude@deploy2002: aude, lmora: Continuing with sync
  • 21:08 aude@deploy2002: aude, lmora: Backport for [Legal Footer] Create config for adding legal footer (T410163) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 21:05 aude@deploy2002: Started scap sync-world: Backport for [Legal Footer] Create config for adding legal footer (T410163)
  • 20:53 aqu@deploy2002: Finished deploy [analytics/refinery@6dfb3b8] (thin): Deploy spur hqls THIN [analytics/refinery@6dfb3b8b] (duration: 01m 16s)
  • 20:51 aqu@deploy2002: Started deploy [analytics/refinery@6dfb3b8] (thin): Deploy spur hqls THIN [analytics/refinery@6dfb3b8b]
  • 20:51 aqu@deploy2002: Finished deploy [analytics/refinery@6dfb3b8]: Deploy spur hqls [analytics/refinery@6dfb3b8b] (duration: 02m 29s)
  • 20:49 aqu@deploy2002: Started deploy [analytics/refinery@6dfb3b8]: Deploy spur hqls [analytics/refinery@6dfb3b8b]
  • 20:48 aqu@deploy2002: Finished deploy [analytics/refinery@6dfb3b8] (hadoop-test): Deploy spur hqls TEST [analytics/refinery@6dfb3b8b] (duration: 01m 01s)
  • 20:47 aqu@deploy2002: Started deploy [analytics/refinery@6dfb3b8] (hadoop-test): Deploy spur hqls TEST [analytics/refinery@6dfb3b8b]
  • 20:44 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudservices1005.eqiad.wmnet with reason: host reimage
  • 20:43 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1020.eqiad.wmnet
  • 20:43 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs1020.eqiad.wmnet
  • 20:40 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudservices1005.eqiad.wmnet with reason: host reimage
  • 20:25 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudservices1005.eqiad.wmnet with OS trixie
  • 20:22 eileen: civicrm upgraded from 45931830 to b1fc5afc
  • 20:02 ejegg: payments-wiki upgraded from eeadc2d8 to 714ed4cf
  • 20:00 eileen: civicrm upgraded from c6d1f24b to 45931830
  • 19:58 sukhe@dns1004: END - running authdns-update
  • 19:57 sukhe@dns1004: START - running authdns-update
  • 19:52 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 (T410589)', diff saved to https://phabricator.wikimedia.org/P86392 and previous config saved to /var/cache/conftool/dbconfig/20251203-195207-ladsgroup.json
  • 19:51 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs1020.eqiad.wmnet with OS bullseye
  • 19:37 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P86390 and previous config saved to /var/cache/conftool/dbconfig/20251203-193659-ladsgroup.json
  • 19:28 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1020.eqiad.wmnet with reason: host reimage
  • 19:23 hashar@deploy2002: Finished deploy [gerrit/gerrit@93bde2a]: Ease configuration of the motd banner (duration: 00m 09s)
  • 19:22 hashar@deploy2002: Started deploy [gerrit/gerrit@93bde2a]: Ease configuration of the motd banner
  • 19:22 cmooney@cumin1003: END (PASS) - Cookbook sre.network.cf (exit_code=0)
  • 19:22 cmooney@cumin1003: START - Cookbook sre.network.cf
  • 19:22 topranks: disabling remote announcement of bgp prefixes
  • 19:21 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P86388 and previous config saved to /var/cache/conftool/dbconfig/20251203-192152-ladsgroup.json
  • 19:21 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs1020.eqiad.wmnet with reason: host reimage
  • 19:14 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudservices1006.eqiad.wmnet with OS trixie
  • 19:06 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 (T410589)', diff saved to https://phabricator.wikimedia.org/P86387 and previous config saved to /var/cache/conftool/dbconfig/20251203-190644-ladsgroup.json
  • 19:06 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs1020.eqiad.wmnet with OS bullseye
  • 18:37 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirtlocal1001.eqiad.wmnet with OS trixie
  • 18:26 ladsgroup@deploy2002: Finished scap sync-world: Backport for findBadBlobs: Fix the --scan-to option (T351953), findBadBlobs: Fix the --scan-to option (T351953) (duration: 06m 48s)
  • 18:25 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1020.eqiad.wmnet with reason: move primary uplink from asw2-d7-eqiad to lsw1-d7-eqiad and remove link to asw2-c2-eqiad
  • 18:22 ladsgroup@deploy2002: ladsgroup: Continuing with sync
  • 18:22 ladsgroup@deploy2002: ladsgroup: Backport for findBadBlobs: Fix the --scan-to option (T351953), findBadBlobs: Fix the --scan-to option (T351953) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 18:19 ladsgroup@deploy2002: Started scap sync-world: Backport for findBadBlobs: Fix the --scan-to option (T351953), findBadBlobs: Fix the --scan-to option (T351953)
  • 18:12 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudservices1006.eqiad.wmnet with reason: host reimage
  • 18:08 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudservices1006.eqiad.wmnet with reason: host reimage
  • 18:05 jhancock@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:05 jhancock@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: updating for cloudceph to codfw - jhancock@cumin1003"
  • 18:04 jhancock@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: updating for cloudceph to codfw - jhancock@cumin1003"
  • 18:01 jhancock@cumin1003: START - Cookbook sre.dns.netbox
  • 18:01 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirtlocal1001.eqiad.wmnet with reason: host reimage
  • 17:57 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirtlocal1001.eqiad.wmnet with reason: host reimage
  • 17:50 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudservices1006.eqiad.wmnet with OS trixie
  • 17:46 sukhe@cumin1003: END (PASS) - Cookbook sre.network.cf (exit_code=0)
  • 17:46 sukhe@cumin1003: START - Cookbook sre.network.cf
  • 17:46 sukhe@cumin1003: END (FAIL) - Cookbook sre.network.cf (exit_code=1)
  • 17:46 sukhe@cumin1003: START - Cookbook sre.network.cf
  • 17:46 sukhe@cumin1003: END (PASS) - Cookbook sre.network.cf (exit_code=0)
  • 17:46 sukhe@cumin1003: START - Cookbook sre.network.cf
  • 17:46 sukhe@cumin1003: END (PASS) - Cookbook sre.network.cf (exit_code=0)
  • 17:45 sukhe@cumin1003: START - Cookbook sre.network.cf
  • 17:40 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirtlocal1001.eqiad.wmnet with OS trixie
  • 17:40 sbisson@deploy2002: Finished scap sync-world: Backport for CX3 Build 1.0.0+20251126 (T384485) (duration: 09m 07s)
  • 17:36 sbisson@deploy2002: sbisson: Continuing with sync
  • 17:34 sbisson@deploy2002: sbisson: Backport for CX3 Build 1.0.0+20251126 (T384485) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 17:31 sbisson@deploy2002: Started scap sync-world: Backport for CX3 Build 1.0.0+20251126 (T384485)
  • 17:11 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db1229.eqiad.wmnet with reason: crashed
  • 17:07 jynus@cumin1003: dbctl commit (dc=all): 'Depooldb1229', diff saved to https://phabricator.wikimedia.org/P86383 and previous config saved to /var/cache/conftool/dbconfig/20251203-170745-jynus.json
  • 17:02 bd808@deploy2002: Finished scap sync-world: Backport for robots.php: Fix undefined index 'enabled' on Wikinews and closed wikis (T411632) (duration: 07m 40s)
  • 16:58 bd808@deploy2002: bd808, krinkle: Continuing with sync
  • 16:57 bd808@deploy2002: bd808, krinkle: Backport for robots.php: Fix undefined index 'enabled' on Wikinews and closed wikis (T411632) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 16:54 bd808@deploy2002: Started scap sync-world: Backport for robots.php: Fix undefined index 'enabled' on Wikinews and closed wikis (T411632)
  • 16:49 bd808@deploy2002: Finished scap sync-world: Backport for officewiki: Put indicators in title with vector-2022, officewiki: Enable page protection indicators (duration: 07m 47s)
  • 16:45 bd808@deploy2002: bd808: Continuing with sync
  • 16:44 bd808@deploy2002: bd808: Backport for officewiki: Put indicators in title with vector-2022, officewiki: Enable page protection indicators synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 16:41 bd808@deploy2002: Started scap sync-world: Backport for officewiki: Put indicators in title with vector-2022, officewiki: Enable page protection indicators
  • 16:15 topranks: disabling unused former cloudcephosd hosts on cloud switches T410989
  • 16:13 dancy@deploy2002: Installation of scap version "4.229.0" completed for 164 hosts
  • 16:09 dancy@deploy2002: Installing scap version "4.229.0" for 164 host(s)
  • 15:32 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host conf2006.codfw.wmnet
  • 15:28 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 15:27 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 15:27 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 15:27 ladsgroup@deploy2002: Finished scap sync-world: Backport for Clean up db groups config (T411088) (duration: 07m 48s)
  • 15:27 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 15:26 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host conf2006.codfw.wmnet
  • 15:26 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 15:26 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 15:23 ladsgroup@deploy2002: ladsgroup: Continuing with sync
  • 15:23 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 15:22 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 15:21 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 15:21 ladsgroup@deploy2002: ladsgroup: Backport for Clean up db groups config (T411088) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 15:21 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 15:20 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 15:20 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 15:19 ladsgroup@deploy2002: Started scap sync-world: Backport for Clean up db groups config (T411088)
  • 15:16 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: sync
  • 15:16 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: sync
  • 15:15 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 15:15 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 15:14 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 15:13 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 15:12 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 15:12 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 15:09 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 15:08 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 15:08 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 15:07 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 15:06 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 15:06 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2205.codfw.wmnet with reason: Maintenance
  • 15:06 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 15:04 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: sync
  • 15:03 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: sync
  • 15:00 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on alert1002.wikimedia.org with reason: C/D Migration
  • 15:00 robh: alert1002 port migration now starting
  • 14:54 Lucas_WMDE: UTC afternoon backport+config window done
  • 14:49 esanders@deploy2002: Finished scap sync-world: Backport for DiscussionTools: cleanup unused config, Remove wgVisualEditorEditCheckSingleCheckMode (duration: 06m 44s)
  • 14:45 esanders@deploy2002: esanders: Continuing with sync
  • 14:44 esanders@deploy2002: esanders: Backport for DiscussionTools: cleanup unused config, Remove wgVisualEditorEditCheckSingleCheckMode synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:42 esanders@deploy2002: Started scap sync-world: Backport for DiscussionTools: cleanup unused config, Remove wgVisualEditorEditCheckSingleCheckMode
  • 14:38 esanders@deploy2002: Finished scap sync-world: Backport for Set Flow to read-only everywhere (T402552) (duration: 09m 44s)
  • 14:33 esanders@deploy2002: esanders: Continuing with sync
  • 14:31 esanders@deploy2002: esanders: Backport for Set Flow to read-only everywhere (T402552) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:29 esanders@deploy2002: Started scap sync-world: Backport for Set Flow to read-only everywhere (T402552)
  • 14:27 XioNoX: push pfw policies - T411566
  • 14:27 sbisson@deploy2002: Finished scap sync-world: Backport for CX3 Build 1.0.0+20251201 (T408842 T408844) (duration: 12m 01s)
  • 14:21 sbisson@deploy2002: sbisson: Continuing with sync
  • 14:17 sbisson@deploy2002: sbisson: Backport for CX3 Build 1.0.0+20251201 (T408842 T408844) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:15 sbisson@deploy2002: Started scap sync-world: Backport for CX3 Build 1.0.0+20251201 (T408842 T408844)
  • 13:50 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2239.codfw.wmnet with reason: Maintenance
  • 13:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86380 and previous config saved to /var/cache/conftool/dbconfig/20251203-135000-marostegui.json
  • 13:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P86379 and previous config saved to /var/cache/conftool/dbconfig/20251203-133452-marostegui.json
  • 13:32 kart_: Updated Recommendation API to 2025-12-02-200719-production (T408845, T408844, T384485)
  • 13:30 kartik@deploy2002: helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
  • 13:25 kartik@deploy2002: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
  • 13:22 kartik@deploy2002: helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
  • 13:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P86378 and previous config saved to /var/cache/conftool/dbconfig/20251203-131945-marostegui.json
  • 13:14 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2229 (T410589)', diff saved to https://phabricator.wikimedia.org/P86377 and previous config saved to /var/cache/conftool/dbconfig/20251203-131448-ladsgroup.json
  • 13:14 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2229.codfw.wmnet with reason: Maintenance
  • 13:14 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224 (T410589)', diff saved to https://phabricator.wikimedia.org/P86376 and previous config saved to /var/cache/conftool/dbconfig/20251203-131435-ladsgroup.json
  • 13:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86375 and previous config saved to /var/cache/conftool/dbconfig/20251203-130437-marostegui.json
  • 13:01 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
  • 13:00 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
  • 13:00 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2227 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86374 and previous config saved to /var/cache/conftool/dbconfig/20251203-130002-marostegui.json
  • 12:59 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2227.codfw.wmnet with reason: Maintenance
  • 12:59 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86373 and previous config saved to /var/cache/conftool/dbconfig/20251203-125938-marostegui.json
  • 12:59 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224', diff saved to https://phabricator.wikimedia.org/P86372 and previous config saved to /var/cache/conftool/dbconfig/20251203-125927-ladsgroup.json
  • 12:57 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
  • 12:56 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
  • 12:56 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
  • 12:55 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/api-gateway: apply
  • 12:54 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
  • 12:53 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
  • 12:52 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
  • 12:52 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
  • 12:51 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/api-gateway: apply
  • 12:51 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
  • 12:50 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
  • 12:50 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
  • 12:50 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
  • 12:49 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
  • 12:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P86371 and previous config saved to /var/cache/conftool/dbconfig/20251203-124430-marostegui.json
  • 12:44 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224', diff saved to https://phabricator.wikimedia.org/P86370 and previous config saved to /var/cache/conftool/dbconfig/20251203-124419-ladsgroup.json
  • 12:32 claime: Restarting failed timer dump_cloud_ip_ranges on puppetservers
  • 12:30 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
  • 12:30 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
  • 12:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P86369 and previous config saved to /var/cache/conftool/dbconfig/20251203-122923-marostegui.json
  • 12:29 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224 (T410589)', diff saved to https://phabricator.wikimedia.org/P86368 and previous config saved to /var/cache/conftool/dbconfig/20251203-122912-ladsgroup.json
  • 12:26 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
  • 12:26 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
  • 12:20 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
  • 12:19 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
  • 12:19 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
  • 12:19 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
  • 12:18 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
  • 12:17 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
  • 12:14 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86367 and previous config saved to /var/cache/conftool/dbconfig/20251203-121409-marostegui.json
  • 12:09 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2209 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86366 and previous config saved to /var/cache/conftool/dbconfig/20251203-120933-marostegui.json
  • 12:09 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2209.codfw.wmnet with reason: Maintenance
  • 12:09 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86365 and previous config saved to /var/cache/conftool/dbconfig/20251203-120909-marostegui.json
  • 11:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P86364 and previous config saved to /var/cache/conftool/dbconfig/20251203-115401-marostegui.json
  • 11:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P86363 and previous config saved to /var/cache/conftool/dbconfig/20251203-113853-marostegui.json
  • 11:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86362 and previous config saved to /var/cache/conftool/dbconfig/20251203-112345-marostegui.json
  • 11:19 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2194 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86361 and previous config saved to /var/cache/conftool/dbconfig/20251203-111910-marostegui.json
  • 11:19 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2194.codfw.wmnet with reason: Maintenance
  • 11:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86360 and previous config saved to /var/cache/conftool/dbconfig/20251203-111846-marostegui.json
  • 11:15 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest2010.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
  • 11:15 elukey@cumin1003: START - Cookbook sre.hosts.provision for host sretest2010.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
  • 11:12 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.powercycle (exit_code=0) for host ml-serve1013
  • 11:07 elukey@cumin1003: START - Cookbook sre.hosts.powercycle for host ml-serve1013
  • 11:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P86359 and previous config saved to /var/cache/conftool/dbconfig/20251203-110338-marostegui.json
  • 10:58 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.powercycle (exit_code=0) for host sretest2001
  • 10:53 elukey@cumin1003: START - Cookbook sre.hosts.powercycle for host sretest2001
  • 10:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P86358 and previous config saved to /var/cache/conftool/dbconfig/20251203-104830-marostegui.json
  • 10:35 kharlan@deploy2002: Finished scap sync-world: Backport for hCaptchaEditAttempt logging: Normalize line endings (T411578), hCaptchaEditAttempt logging: Normalize line endings (T411578) (duration: 07m 56s)
  • 10:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86357 and previous config saved to /var/cache/conftool/dbconfig/20251203-103323-marostegui.json
  • 10:30 kharlan@deploy2002: kharlan: Continuing with sync
  • 10:29 kharlan@deploy2002: kharlan: Backport for hCaptchaEditAttempt logging: Normalize line endings (T411578), hCaptchaEditAttempt logging: Normalize line endings (T411578) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 10:27 kharlan@deploy2002: Started scap sync-world: Backport for hCaptchaEditAttempt logging: Normalize line endings (T411578), hCaptchaEditAttempt logging: Normalize line endings (T411578)
  • 09:19 hashar@deploy2002: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.5 refs T408275
  • 09:14 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti-test2001.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 09:14 ayounsi@cumin1003: START - Cookbook sre.hosts.provision for host ganeti-test2001.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 09:00 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on ganeti-test2001.codfw.wmnet with reason: test CR1207804
  • 08:37 moritzm: upgrade Envoy on schema* T405808
  • 08:32 ryankemper@cumin2002: END (FAIL) - Cookbook sre.hadoop.reboot-workers (exit_code=99) for Hadoop analytics cluster
  • 08:13 moritzm: installing python-zipp security updates
  • 07:47 moritzm: installing libtpms security updates
  • 07:26 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db1169 gradually with 4 steps - Repooling db1169
  • 07:12 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 07:05 moritzm: installing mako security updates
  • 07:01 Amir1: ladsgroup@deploy2002:~$ mwscript-k8s --dblist=all -- purgeUserOptions.php --login-age 11 rememberpassword (T406724)
  • 06:56 Amir1: ladsgroup@deploy2002:~$ mwscript-k8s --dblist=all -- purgeUserOptions.php --login-age 11 popups (T406724)
  • 06:40 marostegui@cumin1003: START - Cookbook sre.mysql.pool db1169 gradually with 4 steps - Repooling db1169
  • 06:39 marostegui@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) db1169 gradually with 4 steps - Repooling db1169
  • 06:38 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2224 (T410589)', diff saved to https://phabricator.wikimedia.org/P86350 and previous config saved to /var/cache/conftool/dbconfig/20251203-063812-ladsgroup.json
  • 06:38 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2224.codfw.wmnet with reason: Maintenance
  • 06:37 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217 (T410589)', diff saved to https://phabricator.wikimedia.org/P86349 and previous config saved to /var/cache/conftool/dbconfig/20251203-063749-ladsgroup.json
  • 06:35 marostegui@cumin1003: START - Cookbook sre.mysql.pool db1169 gradually with 4 steps - Repooling db1169
  • 06:29 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) db1169 - Depooling db1169
  • 06:29 marostegui@cumin1003: START - Cookbook sre.mysql.depool db1169 - Depooling db1169
  • 06:26 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1169.eqiad.wmnet with OS trixie
  • 06:22 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217', diff saved to https://phabricator.wikimedia.org/P86348 and previous config saved to /var/cache/conftool/dbconfig/20251203-062241-ladsgroup.json
  • 06:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
  • 06:15 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
  • 06:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
  • 06:15 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
  • 06:07 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217', diff saved to https://phabricator.wikimedia.org/P86345 and previous config saved to /var/cache/conftool/dbconfig/20251203-060734-ladsgroup.json
  • 06:05 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1169.eqiad.wmnet with reason: host reimage
  • 05:58 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1169.eqiad.wmnet with reason: host reimage
  • 05:52 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217 (T410589)', diff saved to https://phabricator.wikimedia.org/P86344 and previous config saved to /var/cache/conftool/dbconfig/20251203-055226-ladsgroup.json
  • 05:44 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2190 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86343 and previous config saved to /var/cache/conftool/dbconfig/20251203-054438-marostegui.json
  • 05:44 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2190.codfw.wmnet with reason: Maintenance
  • 05:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86342 and previous config saved to /var/cache/conftool/dbconfig/20251203-054414-marostegui.json
  • 05:41 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1169.eqiad.wmnet with OS trixie
  • 05:36 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcontrol1011.eqiad.wmnet with OS trixie
  • 05:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P86341 and previous config saved to /var/cache/conftool/dbconfig/20251203-052906-marostegui.json
  • 05:27 marostegui: Drop sockpuppet database T411527
  • 05:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P86340 and previous config saved to /var/cache/conftool/dbconfig/20251203-051359-marostegui.json
  • 04:59 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcontrol1011.eqiad.wmnet with reason: host reimage
  • 04:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86339 and previous config saved to /var/cache/conftool/dbconfig/20251203-045851-marostegui.json
  • 04:57 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1223.eqiad.wmnet with reason: Maintenance
  • 04:55 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcontrol1011.eqiad.wmnet with reason: host reimage
  • 04:34 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcontrol1011.eqiad.wmnet with OS trixie
  • 04:26 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcontrol1007.eqiad.wmnet with OS trixie
  • 03:50 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcontrol1007.eqiad.wmnet with reason: host reimage
  • 03:46 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcontrol1007.eqiad.wmnet with reason: host reimage
  • 03:30 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcontrol1007.eqiad.wmnet with OS trixie
  • 03:26 krinkle@deploy2002: Finished scap sync-world: Backport for robots.php: Avoid "404 Not Found" for Sitemap rule (T400023) (duration: 11m 08s)
  • 03:22 krinkle@deploy2002: krinkle: Continuing with sync
  • 03:17 krinkle@deploy2002: krinkle: Backport for robots.php: Avoid "404 Not Found" for Sitemap rule (T400023) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 03:15 krinkle@deploy2002: Started scap sync-world: Backport for robots.php: Avoid "404 Not Found" for Sitemap rule (T400023)
  • 03:08 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcontrol1006.eqiad.wmnet with OS trixie
  • 03:08 krinkle@deploy2002: Finished scap sync-world: Backport for robots.php: Clean up unused site, lang, and x-subdomain (T407122), Submit Commons sitemap to Bing/DuckDuckGo and remaining wikis to Google (T400023), robots.txt: Clean up inline comments, robots.txt: Remove redundant "/wiki/Fundraising_2007/comments" disallow (duration: 08m 26s)
  • 03:03 krinkle@deploy2002: krinkle: Continuing with sync
  • 03:02 krinkle@deploy2002: krinkle: Backport for robots.php: Clean up unused site, lang, and x-subdomain (T407122), Submit Commons sitemap to Bing/DuckDuckGo and remaining wikis to Google (T400023), robots.txt: Clean up inline comments, robots.txt: Remove redundant "/wiki/Fundraising_2007/comments" disallow synced to the testservers (see https://wiki
  • 02:59 krinkle@deploy2002: Started scap sync-world: Backport for robots.php: Clean up unused site, lang, and x-subdomain (T407122), Submit Commons sitemap to Bing/DuckDuckGo and remaining wikis to Google (T400023), robots.txt: Clean up inline comments, robots.txt: Remove redundant "/wiki/Fundraising_2007/comments" disallow
  • 02:34 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcontrol1006.eqiad.wmnet with reason: host reimage
  • 02:27 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcontrol1006.eqiad.wmnet with reason: host reimage
  • 02:13 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcontrol1006.eqiad.wmnet with OS trixie
  • 02:05 andrew@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcontrol1006.eqiad.wmnet with OS trixie
  • 01:50 eileen: civicrm upgraded from ef0b2676 to c6d1f24b
  • 01:23 ryankemper@cumin2002: START - Cookbook sre.hadoop.reboot-workers for Hadoop analytics cluster
  • 01:21 ryankemper@cumin2002: END (FAIL) - Cookbook sre.hadoop.reboot-workers (exit_code=99) for Hadoop analytics cluster
  • 01:18 ryankemper@cumin2002: START - Cookbook sre.hadoop.reboot-workers for Hadoop analytics cluster
  • 01:14 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 13m 30s)
  • 01:01 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
  • 00:50 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcontrol1006.eqiad.wmnet with OS trixie
  • 00:33 zabe@deploy2002: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.5 refs T408275
  • 00:24 zabe@deploy2002: Finished scap sync-world: Backport for Close klwiki (T411501) (duration: 07m 29s)
  • 00:20 zabe@deploy2002: zabe: Continuing with sync
  • 00:19 zabe@deploy2002: zabe: Backport for Close klwiki (T411501) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 00:17 zabe@deploy2002: Started scap sync-world: Backport for Close klwiki (T411501)
  • 00:09 zabe@deploy2002: Finished scap sync-world: Backport for Close crwiki (T411501) (duration: 07m 59s)
  • 00:05 zabe@deploy2002: zabe: Continuing with sync
  • 00:04 zabe@deploy2002: zabe: Backport for Close crwiki (T411501) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 00:01 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2217 (T410589)', diff saved to https://phabricator.wikimedia.org/P86338 and previous config saved to /var/cache/conftool/dbconfig/20251203-000140-ladsgroup.json
  • 00:01 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2217.codfw.wmnet with reason: Maintenance
  • 00:01 zabe@deploy2002: Started scap sync-world: Backport for Close crwiki (T411501)

2025-12-02

  • 23:43 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2177 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86337 and previous config saved to /var/cache/conftool/dbconfig/20251202-234356-marostegui.json
  • 23:43 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
  • 23:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86336 and previous config saved to /var/cache/conftool/dbconfig/20251202-234332-marostegui.json
  • 23:41 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudrabbit1002.eqiad.wmnet with OS trixie
  • 23:28 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P86335 and previous config saved to /var/cache/conftool/dbconfig/20251202-232824-marostegui.json
  • 23:23 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudrabbit1002.eqiad.wmnet with reason: host reimage
  • 23:23 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 23:23 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: move IPv6 gerrit-lb to IPs ending in ::2 T365259 - dzahn@cumin2002"
  • 23:22 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: move IPv6 gerrit-lb to IPs ending in ::2 T365259 - dzahn@cumin2002"
  • 23:17 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudrabbit1002.eqiad.wmnet with reason: host reimage
  • 23:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P86334 and previous config saved to /var/cache/conftool/dbconfig/20251202-231317-marostegui.json
  • 23:09 eileen: civicrm upgraded from 8d8400e1 to ef0b2676
  • 23:02 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudrabbit1001.eqiad.wmnet with OS trixie
  • 23:01 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudrabbit1002.eqiad.wmnet with OS trixie
  • 23:00 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudrabbit1003.eqiad.wmnet with OS trixie
  • 22:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86333 and previous config saved to /var/cache/conftool/dbconfig/20251202-225809-marostegui.json
  • 22:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86332 and previous config saved to /var/cache/conftool/dbconfig/20251202-225122-marostegui.json
  • 22:45 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudrabbit1001.eqiad.wmnet with reason: host reimage
  • 22:42 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudrabbit1003.eqiad.wmnet with reason: host reimage
  • 22:41 dzahn@cumin2002: START - Cookbook sre.dns.netbox
  • 22:39 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudrabbit1001.eqiad.wmnet with reason: host reimage
  • 22:38 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudrabbit1003.eqiad.wmnet with reason: host reimage
  • 22:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P86331 and previous config saved to /var/cache/conftool/dbconfig/20251202-223615-marostegui.json
  • 22:33 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
  • 22:32 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
  • 22:25 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudrabbit1001.eqiad.wmnet with OS trixie
  • 22:23 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudrabbit1003.eqiad.wmnet with OS trixie
  • 22:21 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P86330 and previous config saved to /var/cache/conftool/dbconfig/20251202-222107-marostegui.json
  • 22:20 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1061.eqiad.wmnet with OS trixie
  • 22:09 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1054.eqiad.wmnet with OS trixie
  • 22:09 catrope@deploy2002: Finished scap sync-world: Backport for CentralAuthUser: Add debugging information for T385310 (T385310), CentralAuthUser: Add debugging information for T385310 (T385310) (duration: 07m 29s)
  • 22:06 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86329 and previous config saved to /var/cache/conftool/dbconfig/20251202-220600-marostegui.json
  • 22:05 catrope@deploy2002: catrope, matmarex: Continuing with sync
  • 22:04 catrope@deploy2002: catrope, matmarex: Backport for CentralAuthUser: Add debugging information for T385310 (T385310), CentralAuthUser: Add debugging information for T385310 (T385310) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 22:01 catrope@deploy2002: Started scap sync-world: Backport for CentralAuthUser: Add debugging information for T385310 (T385310), CentralAuthUser: Add debugging information for T385310 (T385310)
  • 21:55 dani@deploy2002: Finished scap sync-world: Backport for [beta] Undeploy experiment for 2025 Global Readers Survey (T410696), Deploy 2025 Global Readers Survey (non-enwiki) (T410918) (duration: 10m 23s)
  • 21:51 dani@deploy2002: dani: Continuing with sync
  • 21:47 dani@deploy2002: dani: Backport for [beta] Undeploy experiment for 2025 Global Readers Survey (T410696), Deploy 2025 Global Readers Survey (non-enwiki) (T410918) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 21:44 dani@deploy2002: Started scap sync-world: Backport for [beta] Undeploy experiment for 2025 Global Readers Survey (T410696), Deploy 2025 Global Readers Survey (non-enwiki) (T410918)
  • 21:43 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
  • 21:43 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
  • 21:42 kgraessle@deploy2002: Finished scap sync-world: Backport for Enable revertrisk filters in thwiki (T409438) (duration: 10m 34s)
  • 21:38 jhancock@cumin1003: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['backup2013']
  • 21:38 jhancock@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['backup2013']
  • 21:38 kgraessle@deploy2002: kgraessle: Continuing with sync
  • 21:36 kgraessle@deploy2002: kgraessle: Backport for Enable revertrisk filters in thwiki (T409438) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 21:34 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
  • 21:34 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
  • 21:33 bking@dns1004: END - running authdns-update
  • 21:32 bking@dns1004: START - running authdns-update
  • 21:31 kgraessle@deploy2002: Started scap sync-world: Backport for Enable revertrisk filters in thwiki (T409438)
  • 21:31 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
  • 21:29 kharlan@deploy2002: Finished scap sync-world: Backport for Refactor: Move editing session ID logic into service (T406865), hCaptcha: Log diff when challenge is presented (T406865) (duration: 59m 06s)
  • 21:26 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1050.eqiad.wmnet with OS trixie
  • 21:20 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1061.eqiad.wmnet with reason: host reimage
  • 21:17 kharlan@deploy2002: kharlan: Continuing with sync
  • 21:16 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1061.eqiad.wmnet with reason: host reimage
  • 21:15 kharlan@deploy2002: kharlan: Backport for Refactor: Move editing session ID logic into service (T406865), hCaptcha: Log diff when challenge is presented (T406865) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 21:14 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 21:14 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added gerrit-lb VIP - IPv6 - for ulsfo and magru T365259 - dzahn@cumin2002"
  • 21:14 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added gerrit-lb VIP - IPv6 - for ulsfo and magru T365259 - dzahn@cumin2002"
  • 21:10 dzahn@cumin2002: START - Cookbook sre.dns.netbox
  • 21:04 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 21:03 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added gerrit-lb VIP - IPv6 - for drmrs, eqsin and esams T365259 - dzahn@cumin2002"
  • 21:03 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added gerrit-lb VIP - IPv6 - for drmrs, eqsin and esams T365259 - dzahn@cumin2002"
  • 21:00 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt1061.eqiad.wmnet with OS trixie
  • 21:00 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1046.eqiad.wmnet with OS trixie
  • 20:58 dzahn@cumin2002: START - Cookbook sre.dns.netbox
  • 20:52 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 20:52 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added gerrit-lb VIP - IPv6 - for codfw and eqiad T365259 - dzahn@cumin2002"
  • 20:52 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added gerrit-lb VIP - IPv6 - for codfw and eqiad T365259 - dzahn@cumin2002"
  • 20:48 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1054.eqiad.wmnet with reason: host reimage
  • 20:48 dzahn@cumin2002: START - Cookbook sre.dns.netbox
  • 20:44 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1054.eqiad.wmnet with reason: host reimage
  • 20:43 eileen: civicrm upgraded from c90bd037 to 8d8400e1
  • 20:38 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1050.eqiad.wmnet with reason: host reimage
  • 20:37 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 20:37 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added gerrit-lb VIP for magru and eqiad T365259 - dzahn@cumin2002"
  • 20:37 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added gerrit-lb VIP for magru and eqiad T365259 - dzahn@cumin2002"
  • 20:34 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1046.eqiad.wmnet with reason: host reimage
  • 20:33 dzahn@cumin2002: START - Cookbook sre.dns.netbox
  • 20:31 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 20:31 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added gerrit-lb VIP for drmrs and eqsin T365259 - dzahn@cumin2002"
  • 20:31 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added gerrit-lb VIP for drmrs and eqsin T365259 - dzahn@cumin2002"
  • 20:30 kharlan@deploy2002: Started scap sync-world: Backport for Refactor: Move editing session ID logic into service (T406865), hCaptcha: Log diff when challenge is presented (T406865)
  • 20:29 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1050.eqiad.wmnet with reason: host reimage
  • 20:28 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt1054.eqiad.wmnet with OS trixie
  • 20:26 dzahn@cumin2002: START - Cookbook sre.dns.netbox
  • 20:26 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1046.eqiad.wmnet with reason: host reimage
  • 20:18 jhathaway@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest1005.eqiad.wmnet with OS bookworm
  • 20:18 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 20:18 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added gerrit-lb VIP for esams and ulsfo T365259 - dzahn@cumin2002"
  • 20:18 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added gerrit-lb VIP for esams and ulsfo T365259 - dzahn@cumin2002"
  • 20:13 dzahn@cumin2002: START - Cookbook sre.dns.netbox
  • 20:12 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt1050.eqiad.wmnet with OS trixie
  • 20:09 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt1046.eqiad.wmnet with OS trixie
  • 19:58 jhathaway@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest1005.eqiad.wmnet with reason: host reimage
  • 19:53 jhathaway@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest1005.eqiad.wmnet with reason: host reimage
  • 19:52 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:49 dzahn@cumin2002: START - Cookbook sre.dns.netbox
  • 19:48 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:48 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added gerrit-lb.codfw.wikimedia.org T365259 - dzahn@cumin2002"
  • 19:46 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added gerrit-lb.codfw.wikimedia.org T365259 - dzahn@cumin2002"
  • 19:43 cstone: payments-wiki upgraded from 6d39e545 to eeadc2d8
  • 19:42 dzahn@cumin2002: START - Cookbook sre.dns.netbox
  • 19:34 jhathaway@cumin1003: START - Cookbook sre.hosts.reimage for host sretest1005.eqiad.wmnet with OS bookworm
  • 19:15 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
  • 19:15 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
  • 19:07 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P{lvs3010*} and A:liberica
  • 19:03 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P{lvs3010*} and A:liberica
  • 18:53 swfrench@cumin2002: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs-high-traffic1-eqiad (T352245)
  • 18:53 swfrench@cumin2002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-high-traffic1-eqiad (T352245)
  • 18:52 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudbackup1001-dev.eqiad.wmnet with OS trixie
  • 18:47 swfrench@cumin2002: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs-high-traffic2-eqiad (T352245)
  • 18:47 swfrench@cumin2002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-high-traffic2-eqiad (T352245)
  • 18:41 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
  • 18:41 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
  • 18:41 swfrench@cumin2002: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs-low-traffic-eqiad (T352245)
  • 18:40 swfrench@cumin2002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-low-traffic-eqiad (T352245)
  • 18:36 swfrench@cumin2002: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs-secondary-eqiad (T352245)
  • 18:36 swfrench@cumin2002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-secondary-eqiad (T352245)
  • 18:32 Emperor: repool ms-fe2014 T410959
  • 18:27 swfrench@deploy2002: Unlocked for deployment [MediaWiki]: Hold deployments during etcd certificate change - T352245 (duration: 17m 35s)
  • 18:26 swfrench-wmf: restarted navtiming on webperf1003 - T352245
  • 18:23 swfrench-wmf: begin rolling restarts of eqiad-associated confds - T352245
  • 18:22 swfrench-wmf: migrating etcd to PKI certs on conf1007 - T352245
  • 18:19 swfrench-wmf: deleted EtcdReplicationDown silence (42a82757-2075-44fd-b057-ec9ed2afeb90) - T352245
  • 18:16 swfrench-wmf: manually transferred etcd replication source back to conf1009 - T352245
  • 18:15 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
  • 18:15 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
  • 18:12 swfrench-wmf: migrating etcd to PKI certs on conf1009 - T352245
  • 18:10 swfrench@deploy2002: Locking from deployment [MediaWiki]: Hold deployments during etcd certificate change - T352245
  • 18:08 jhathaway@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest1005.eqiad.wmnet with OS bookworm
  • 18:06 rzl@deploy2002: Finished scap sync-world: https://gerrit.wikimedia.org/r/1208442 T407553 (duration: 06m 36s)
  • 18:04 swfrench-wmf: manually transferred codfw etcd replication source to conf1008 - T352245
  • 18:02 rzl@deploy2002: rzl: Continuing with sync
  • 18:01 rzl@deploy2002: rzl: https://gerrit.wikimedia.org/r/1208442 T407553 synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 18:01 swfrench-wmf: silenced EtcdReplicationDown (42a82757-2075-44fd-b057-ec9ed2afeb90) - T352245
  • 18:00 rzl@deploy2002: Started scap sync-world: https://gerrit.wikimedia.org/r/1208442 T407553
  • 17:48 jhathaway@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest1005.eqiad.wmnet with reason: host reimage
  • 17:47 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1212 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86328 and previous config saved to /var/cache/conftool/dbconfig/20251202-174732-marostegui.json
  • 17:47 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 6 hosts with reason: Maintenance
  • 17:47 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1212.eqiad.wmnet with reason: Maintenance
  • 17:44 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db1251.eqiad.wmnet onto db1169.eqiad.wmnet
  • 17:43 jhathaway@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest1005.eqiad.wmnet with reason: host reimage
  • 17:42 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2156 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86327 and previous config saved to /var/cache/conftool/dbconfig/20251202-174249-marostegui.json
  • 17:42 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
  • 17:42 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86326 and previous config saved to /var/cache/conftool/dbconfig/20251202-174225-marostegui.json
  • 17:29 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudbackup1001-dev.eqiad.wmnet with reason: host reimage
  • 17:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P86325 and previous config saved to /var/cache/conftool/dbconfig/20251202-172717-marostegui.json
  • 17:24 jhathaway@cumin1003: START - Cookbook sre.hosts.reimage for host sretest1005.eqiad.wmnet with OS bookworm
  • 17:22 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudbackup1001-dev.eqiad.wmnet with reason: host reimage
  • 17:21 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2197.codfw.wmnet with reason: Maintenance
  • 17:21 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193 (T410589)', diff saved to https://phabricator.wikimedia.org/P86324 and previous config saved to /var/cache/conftool/dbconfig/20251202-172134-ladsgroup.json
  • 17:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P86323 and previous config saved to /var/cache/conftool/dbconfig/20251202-171210-marostegui.json
  • 17:10 jhathaway@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest1005.eqiad.wmnet with OS bookworm
  • 17:09 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudbackup1001-dev.eqiad.wmnet with OS trixie
  • 17:06 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P86322 and previous config saved to /var/cache/conftool/dbconfig/20251202-170627-ladsgroup.json
  • 17:06 swfrench@cumin2002: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs-high-traffic1-eqiad (T352245)
  • 17:05 swfrench@cumin2002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-high-traffic1-eqiad (T352245)
  • 17:03 brett: import varnish-modules 0.20.0-2~deb13+wmf1 into trixie-wikimedia - T401832
  • 17:02 swfrench@cumin2002: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs-high-traffic2-eqiad (T352245)
  • 17:01 swfrench@cumin2002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-high-traffic2-eqiad (T352245)
  • 16:59 swfrench@cumin2002: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs-low-traffic-eqiad (T352245)
  • 16:58 swfrench@cumin2002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-low-traffic-eqiad (T352245)
  • 16:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86321 and previous config saved to /var/cache/conftool/dbconfig/20251202-165702-marostegui.json
  • 16:54 jhathaway@cumin1003: START - Cookbook sre.hosts.reimage for host sretest1005.eqiad.wmnet with OS bookworm
  • 16:53 swfrench@cumin2002: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs-secondary-eqiad (T352245)
  • 16:53 swfrench@cumin2002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-secondary-eqiad (T352245)
  • 16:51 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P86320 and previous config saved to /var/cache/conftool/dbconfig/20251202-165119-ladsgroup.json
  • 16:51 jhathaway@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1005.eqiad.wmnet with OS bookworm
  • 16:44 ihurbain@deploy2002: Finished scap sync-world: Backport for Bump parsoid to v0.23.0-a7.1 on wmf.4 (T411238 T410960), Bump parsoid to v0.23.0-a7.1 on wmf.4 (T411238 T410960) (duration: 09m 21s)
  • 16:43 jhathaway@cumin1003: START - Cookbook sre.hosts.reimage for host sretest1005.eqiad.wmnet with OS bookworm
  • 16:43 inflatador: bking@wmf3062 restart WDQS codfw to resolve lag/possible deadlocks
  • 16:39 ihurbain@deploy2002: ihurbain: Continuing with sync
  • 16:39 jhathaway@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest1005.eqiad.wmnet with OS bookworm
  • 16:38 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
  • 16:38 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
  • 16:37 ihurbain@deploy2002: ihurbain: Backport for Bump parsoid to v0.23.0-a7.1 on wmf.4 (T411238 T410960), Bump parsoid to v0.23.0-a7.1 on wmf.4 (T411238 T410960) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 16:36 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193 (T410589)', diff saved to https://phabricator.wikimedia.org/P86319 and previous config saved to /var/cache/conftool/dbconfig/20251202-163612-ladsgroup.json
  • 16:36 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db1251 gradually with 4 steps - Pool db1251.eqiad.wmnet in after cloning
  • 16:35 ihurbain@deploy2002: Started scap sync-world: Backport for Bump parsoid to v0.23.0-a7.1 on wmf.4 (T411238 T410960), Bump parsoid to v0.23.0-a7.1 on wmf.4 (T411238 T410960)
  • 16:30 jhathaway@cumin1003: START - Cookbook sre.hosts.reimage for host sretest1005.eqiad.wmnet with OS bookworm
  • 16:27 brett: import varnish 7.1.1-2~bpo13+wmf2 into trixie-wikimedia - T401832
  • 16:24 jhathaway@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1005.eqiad.wmnet with OS bookworm
  • 16:23 jhathaway@cumin1003: START - Cookbook sre.hosts.reimage for host sretest1005.eqiad.wmnet with OS bookworm
  • 16:20 jhathaway@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1005.eqiad.wmnet with OS bookworm
  • 16:19 jhathaway@cumin1003: START - Cookbook sre.hosts.reimage for host sretest1005.eqiad.wmnet with OS bookworm
  • 16:18 swfrench-wmf: restarted navtiming on webperf1003 - T352245
  • 16:14 swfrench-wmf: begin rolling restarts of eqiad-associated confds - T352245
  • 16:12 moritzm: installing nodejs security updates
  • 16:12 swfrench@deploy2002: Unlocked for deployment [MediaWiki]: Hold deployments during etcd certificate change - T352245 (duration: 03m 45s)
  • 16:12 jhathaway@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1005.eqiad.wmnet with OS bookworm
  • 16:10 jhathaway@cumin1003: START - Cookbook sre.hosts.reimage for host sretest1005.eqiad.wmnet with OS bookworm
  • 16:08 swfrench@deploy2002: Locking from deployment [MediaWiki]: Hold deployments during etcd certificate change - T352245
  • 16:08 swfrench-wmf: migrating etcd to PKI certs on conf1008 - T352245
  • 16:08 jhathaway@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest1005.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 16:02 moritzm: installing libsndfile security updates
  • 16:01 jhathaway@cumin1003: START - Cookbook sre.hosts.provision for host sretest1005.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 16:00 gehel: restarting wdqs@codfw - system overloaded
  • 15:58 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on sretest1005.eqiad.wmnet with reason: ipxe
  • 15:50 marostegui@cumin1003: START - Cookbook sre.mysql.pool db1251 gradually with 4 steps - Pool db1251.eqiad.wmnet in after cloning
  • 15:48 moritzm: upgrade Envoy on Yarn T405808
  • 15:45 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1088.eqiad.wmnet with OS bullseye
  • 15:29 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1088.eqiad.wmnet with reason: host reimage
  • 15:26 mvernon@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1088.eqiad.wmnet with reason: host reimage
  • 15:13 moritzm: upgrade Envoy on Turnilo T405808
  • 15:12 mvernon@cumin1003: START - Cookbook sre.hosts.reimage for host ms-be1088.eqiad.wmnet with OS bullseye
  • 14:51 Lucas_WMDE: UTC afternoon backport+config window done
  • 14:47 urbanecm@deploy2002: Finished scap sync-world: Backport for [Growth] Enable Add Link for 3 wikis (T407818) (duration: 07m 46s)
  • 14:43 urbanecm@deploy2002: urbanecm: Continuing with sync
  • 14:41 urbanecm@deploy2002: urbanecm: Backport for [Growth] Enable Add Link for 3 wikis (T407818) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:41 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1198 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86314 and previous config saved to /var/cache/conftool/dbconfig/20251202-144148-marostegui.json
  • 14:41 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1198.eqiad.wmnet with reason: Maintenance
  • 14:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86313 and previous config saved to /var/cache/conftool/dbconfig/20251202-144123-marostegui.json
  • 14:39 urbanecm@deploy2002: Started scap sync-world: Backport for [Growth] Enable Add Link for 3 wikis (T407818)
  • 14:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti-test2001.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 14:30 derick@deploy2002: Finished scap sync-world: Backport for user: Mark users created with User::addToDatabase() as primary (T410652) (duration: 08m 34s)
  • 14:28 ayounsi@cumin1003: START - Cookbook sre.hosts.provision for host ganeti-test2001.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 14:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P86312 and previous config saved to /var/cache/conftool/dbconfig/20251202-142616-marostegui.json
  • 14:26 derick@deploy2002: d3r1ck01, derick: Continuing with sync
  • 14:25 derick@deploy2002: d3r1ck01, derick: Backport for user: Mark users created with User::addToDatabase() as primary (T410652) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:21 derick@deploy2002: Started scap sync-world: Backport for user: Mark users created with User::addToDatabase() as primary (T410652)
  • 14:21 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti-test2001.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 14:18 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Growth: Enable Revise Tone feature on pilot wikis (T409606) (duration: 13m 03s)
  • 14:14 ayounsi@cumin1003: START - Cookbook sre.hosts.provision for host ganeti-test2001.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 14:13 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, migr: Continuing with sync
  • 14:12 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti-test2001.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 14:11 ayounsi@cumin1003: START - Cookbook sre.hosts.provision for host ganeti-test2001.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 14:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P86311 and previous config saved to /var/cache/conftool/dbconfig/20251202-141108-marostegui.json
  • 14:11 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on ganeti-test2001.codfw.wmnet with reason: test CR1207804
  • 14:10 jgleeson: payments-wiki upgraded from b405d6db to 6d39e545
  • 14:07 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, migr: Backport for Growth: Enable Revise Tone feature on pilot wikis (T409606) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:05 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Growth: Enable Revise Tone feature on pilot wikis (T409606)
  • 13:58 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) db1251 - Depool db1251.eqiad.wmnet to then clone it to db1169.eqiad.wmnet - marostegui@cumin1003
  • 13:58 marostegui@cumin1003: START - Cookbook sre.mysql.depool db1251 - Depool db1251.eqiad.wmnet to then clone it to db1169.eqiad.wmnet - marostegui@cumin1003
  • 13:58 marostegui@cumin1003: START - Cookbook sre.mysql.clone of db1251.eqiad.wmnet onto db1169.eqiad.wmnet
  • 13:57 dbrant@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifeeds: apply
  • 13:56 dbrant@deploy2002: helmfile [codfw] START helmfile.d/services/wikifeeds: apply
  • 13:56 dbrant@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifeeds: apply
  • 13:56 dbrant@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifeeds: apply
  • 13:56 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86309 and previous config saved to /var/cache/conftool/dbconfig/20251202-135600-marostegui.json
  • 13:55 dbrant@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
  • 13:54 dbrant@deploy2002: helmfile [staging] START helmfile.d/services/wikifeeds: apply
  • 13:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1169.eqiad.wmnet with OS bookworm
  • 13:07 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
  • 13:07 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
  • 13:04 brouberol: running rebalancing of kafka-main-codfw with throttle of 30MB/s - T407185
  • 13:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1169.eqiad.wmnet with reason: host reimage
  • 12:59 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1169.eqiad.wmnet with reason: host reimage
  • 12:46 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2193 (T410589)', diff saved to https://phabricator.wikimedia.org/P86308 and previous config saved to /var/cache/conftool/dbconfig/20251202-124632-ladsgroup.json
  • 12:46 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2193.codfw.wmnet with reason: Maintenance
  • 12:46 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T410589)', diff saved to https://phabricator.wikimedia.org/P86307 and previous config saved to /var/cache/conftool/dbconfig/20251202-124609-ladsgroup.json
  • 12:43 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host db1169.eqiad.wmnet with OS bookworm
  • 12:41 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host db1169.eqiad.wmnet with OS bookworm
  • 12:40 kharlan@deploy2002: Finished scap sync-world: Backport for SI: Skip successfuledit event for null edits (T410280), SI: Skip successfuledit event for null edits (T410280) (duration: 06m 39s)
  • 12:36 kharlan@deploy2002: kharlan: Continuing with sync
  • 12:35 kharlan@deploy2002: kharlan: Backport for SI: Skip successfuledit event for null edits (T410280), SI: Skip successfuledit event for null edits (T410280) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 12:33 kharlan@deploy2002: Started scap sync-world: Backport for SI: Skip successfuledit event for null edits (T410280), SI: Skip successfuledit event for null edits (T410280)
  • 12:31 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P86305 and previous config saved to /var/cache/conftool/dbconfig/20251202-123102-ladsgroup.json
  • 12:17 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host db1169.eqiad.wmnet with OS bookworm
  • 12:15 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P86304 and previous config saved to /var/cache/conftool/dbconfig/20251202-121554-ladsgroup.json
  • 12:04 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
  • 12:04 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
  • 12:00 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T410589)', diff saved to https://phabricator.wikimedia.org/P86303 and previous config saved to /var/cache/conftool/dbconfig/20251202-120046-ladsgroup.json
  • 11:57 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
  • 11:56 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
  • 11:44 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
  • 11:44 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
  • 11:41 kharlan@deploy2002: Finished scap sync-world: Backport for wgAutoConfirmCount: Raise value to 10 for frwiki, idwiki, trwiki (T411263) (duration: 08m 28s)
  • 11:37 Emperor: rebuild RAID on ms-fe2014 T410959
  • 11:36 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2149 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86302 and previous config saved to /var/cache/conftool/dbconfig/20251202-113625-marostegui.json
  • 11:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
  • 11:35 kharlan@deploy2002: kharlan: Continuing with sync
  • 11:34 kharlan@deploy2002: kharlan: Backport for wgAutoConfirmCount: Raise value to 10 for frwiki, idwiki, trwiki (T411263) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 11:32 kharlan@deploy2002: Started scap sync-world: Backport for wgAutoConfirmCount: Raise value to 10 for frwiki, idwiki, trwiki (T411263)
  • 11:16 kharlan@deploy2002: Finished scap sync-world: Backport for hCaptcha: Switch frwiki to 99.9% passive mode (T405586), hCaptcha: Enable hCaptcha editing in 100% passive mode on enwiki (T405586) (duration: 08m 55s)
  • 11:12 kharlan@deploy2002: kharlan: Continuing with sync
  • 11:10 kharlan@deploy2002: kharlan: Backport for hCaptcha: Switch frwiki to 99.9% passive mode (T405586), hCaptcha: Enable hCaptcha editing in 100% passive mode on enwiki (T405586) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 11:07 kharlan@deploy2002: Started scap sync-world: Backport for hCaptcha: Switch frwiki to 99.9% passive mode (T405586), hCaptcha: Enable hCaptcha editing in 100% passive mode on enwiki (T405586)
  • 10:51 moritzm: rebuild software raid following disk swap on bast2003 T410195
  • 10:41 bwojtowicz@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
  • 10:38 elukey: upgrade spicerack to 12.1.0 on all cumin hosts
  • 10:36 cmooney@cumin1003: END (FAIL) - Cookbook sre.hosts.dhcp (exit_code=99) for host sretest1005.eqiad.wmnet
  • 10:36 kharlan@deploy2002: Finished scap sync-world: Backport for UserInfoCard: Hide activity graph when it's likely to be inaccurate (T400409) (duration: 10m 26s)
  • 10:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
  • 10:33 ayounsi@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
  • 10:33 ayounsi@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
  • 10:32 ayounsi@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
  • 10:32 kharlan@deploy2002: kharlan: Continuing with sync
  • 10:31 bwojtowicz@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
  • 10:29 bwojtowicz@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
  • 10:27 kharlan@deploy2002: kharlan: Backport for UserInfoCard: Hide activity graph when it's likely to be inaccurate (T400409) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 10:25 kharlan@deploy2002: Started scap sync-world: Backport for UserInfoCard: Hide activity graph when it's likely to be inaccurate (T400409)
  • 10:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db2223 gradually with 4 steps - After switchover
  • 10:21 kharlan@deploy2002: Finished scap sync-world: Backport for Allow similar signals to be merged into an existing case (T410303) (duration: 07m 52s)
  • 10:17 kharlan@deploy2002: kharlan: Continuing with sync
  • 10:15 kharlan@deploy2002: kharlan: Backport for Allow similar signals to be merged into an existing case (T410303) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 10:13 kharlan@deploy2002: Started scap sync-world: Backport for Allow similar signals to be merged into an existing case (T410303)
  • 10:05 ayounsi@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
  • 10:04 ayounsi@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
  • 09:55 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2192.codfw.wmnet with reason: Maintenance
  • 09:53 marostegui@cumin1003: START - Cookbook sre.mysql.pool db2223 gradually with 4 steps - After switchover
  • 09:53 marostegui@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) db2223 gradually with 4 steps - After switchover
  • 09:52 marostegui@cumin1003: START - Cookbook sre.mysql.pool db2223 gradually with 4 steps - After switchover
  • 09:50 ayounsi@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
  • 09:50 ayounsi@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
  • 09:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86298 and previous config saved to /var/cache/conftool/dbconfig/20251202-094931-marostegui.json
  • 09:46 elukey: uploaded spicerack_12.1.0 to apt.wikimedia.org bullseye-wikimedia,bookworm-wikimedia
  • 09:43 klausman@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
  • 09:43 klausman@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
  • 09:43 klausman@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
  • 09:42 klausman@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
  • 09:41 klausman@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
  • 09:41 klausman@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
  • 09:38 moritzm: upgrade Envoy on parsoidtest/testreduce T405808
  • 09:09 hashar@deploy2002: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.5 refs T408275
  • 09:09 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1189 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86297 and previous config saved to /var/cache/conftool/dbconfig/20251202-090932-marostegui.json
  • 09:09 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1189.eqiad.wmnet with reason: Maintenance
  • 09:09 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86296 and previous config saved to /var/cache/conftool/dbconfig/20251202-090908-marostegui.json
  • 09:03 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2223 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86295 and previous config saved to /var/cache/conftool/dbconfig/20251202-090334-marostegui.json
  • 09:03 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2223.codfw.wmnet with reason: Maintenance
  • 09:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86294 and previous config saved to /var/cache/conftool/dbconfig/20251202-090321-marostegui.json
  • 08:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P86293 and previous config saved to /var/cache/conftool/dbconfig/20251202-085401-marostegui.json
  • 08:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213', diff saved to https://phabricator.wikimedia.org/P86292 and previous config saved to /var/cache/conftool/dbconfig/20251202-084813-marostegui.json
  • 08:40 gehel: restarting wdqs@codfw - system overloaded
  • 08:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P86291 and previous config saved to /var/cache/conftool/dbconfig/20251202-083853-marostegui.json
  • 08:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213', diff saved to https://phabricator.wikimedia.org/P86290 and previous config saved to /var/cache/conftool/dbconfig/20251202-083306-marostegui.json
  • 08:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86289 and previous config saved to /var/cache/conftool/dbconfig/20251202-082345-marostegui.json
  • 08:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86288 and previous config saved to /var/cache/conftool/dbconfig/20251202-081758-marostegui.json
  • 08:17 dcausse: closing the utc morning backport window
  • 08:14 dcausse@deploy2002: Finished scap sync-world: Backport for cirrus: enable georgian transliteration second try profile (T408737) (duration: 10m 00s)
  • 08:09 dcausse@deploy2002: dcausse: Continuing with sync
  • 08:06 dcausse@deploy2002: dcausse: Backport for cirrus: enable georgian transliteration second try profile (T408737) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 08:04 dcausse@deploy2002: Started scap sync-world: Backport for cirrus: enable georgian transliteration second try profile (T408737)
  • 07:53 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2223.codfw.wmnet with reason: Schema change
  • 07:35 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2180 (T410589)', diff saved to https://phabricator.wikimedia.org/P86287 and previous config saved to /var/cache/conftool/dbconfig/20251202-073553-ladsgroup.json
  • 07:35 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2180.codfw.wmnet with reason: Maintenance
  • 07:35 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169 (T410589)', diff saved to https://phabricator.wikimedia.org/P86286 and previous config saved to /var/cache/conftool/dbconfig/20251202-073530-ladsgroup.json
  • 07:20 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P86285 and previous config saved to /var/cache/conftool/dbconfig/20251202-072022-ladsgroup.json
  • 07:05 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P86284 and previous config saved to /var/cache/conftool/dbconfig/20251202-070514-ladsgroup.json
  • 06:50 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169 (T410589)', diff saved to https://phabricator.wikimedia.org/P86283 and previous config saved to /var/cache/conftool/dbconfig/20251202-065007-ladsgroup.json
  • 06:32 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2228.codfw.wmnet with reason: Schema change
  • 05:59 kart_: Updated cxserver to 2025-12-02-041957-production + Yandex key removal from production config
  • 05:59 kartik@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
  • 05:57 kartik@deploy2002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
  • 05:52 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
  • 05:52 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/cxserver: apply
  • 05:50 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
  • 05:49 kartik@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
  • 05:20 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2213 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86282 and previous config saved to /var/cache/conftool/dbconfig/20251202-052010-marostegui.json
  • 05:20 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2213.codfw.wmnet with reason: Maintenance
  • 05:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86281 and previous config saved to /var/cache/conftool/dbconfig/20251202-051947-marostegui.json
  • 05:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P86280 and previous config saved to /var/cache/conftool/dbconfig/20251202-050439-marostegui.json
  • 05:02 mwpresync@deploy2002: Pruned MediaWiki: 1.46.0-wmf.2 (duration: 02m 56s)
  • 04:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P86279 and previous config saved to /var/cache/conftool/dbconfig/20251202-044931-marostegui.json
  • 04:48 mwpresync@deploy2002: Finished scap sync-world: testwikis to 1.46.0-wmf.5 refs T408275 (duration: 44m 45s)
  • 04:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86278 and previous config saved to /var/cache/conftool/dbconfig/20251202-043424-marostegui.json
  • 04:03 mwpresync@deploy2002: Started scap sync-world: testwikis to 1.46.0-wmf.5 refs T408275
  • 03:52 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1175 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86277 and previous config saved to /var/cache/conftool/dbconfig/20251202-035202-marostegui.json
  • 03:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1175.eqiad.wmnet with reason: Maintenance
  • 03:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86276 and previous config saved to /var/cache/conftool/dbconfig/20251202-035138-marostegui.json
  • 03:43 cstone: payments-wiki upgraded from c1b83aa2 to b405d6db
  • 03:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P86275 and previous config saved to /var/cache/conftool/dbconfig/20251202-033630-marostegui.json
  • 03:21 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P86274 and previous config saved to /var/cache/conftool/dbconfig/20251202-032122-marostegui.json
  • 03:15 mutante: vrts1003 - compressed /opt/znuny-6.5.16 and .17 to .tar.gz files - then deleted uncompressed versions - freeing about 700k inodes (T411452)
  • 03:14 mutante: vrts1003 - sudo -u otrs ./bin/otrs.Console.pl Maint::Cache::Delete (T411452)
  • 03:06 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86273 and previous config saved to /var/cache/conftool/dbconfig/20251202-030615-marostegui.json
  • 01:36 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2211 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86272 and previous config saved to /var/cache/conftool/dbconfig/20251202-013635-marostegui.json
  • 01:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2211.codfw.wmnet with reason: Maintenance
  • 00:05 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2169 (T410589)', diff saved to https://phabricator.wikimedia.org/P86271 and previous config saved to /var/cache/conftool/dbconfig/20251202-000540-ladsgroup.json
  • 00:05 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
  • 00:05 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T410589)', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20251202-000512-ladsgroup.json

2025-12-01

  • 23:50 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P86269 and previous config saved to /var/cache/conftool/dbconfig/20251201-235004-ladsgroup.json
  • 23:45 catrope@deploy2002: Finished scap sync-world: Backport for Make sure WebAuthnKey::$supportsPasswordless is always initialized (T411368) (duration: 07m 36s)
  • 23:41 catrope@deploy2002: catrope: Continuing with sync
  • 23:39 catrope@deploy2002: catrope: Backport for Make sure WebAuthnKey::$supportsPasswordless is always initialized (T411368) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 23:38 catrope@deploy2002: Started scap sync-world: Backport for Make sure WebAuthnKey::$supportsPasswordless is always initialized (T411368)
  • 23:34 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P86268 and previous config saved to /var/cache/conftool/dbconfig/20251201-233456-ladsgroup.json
  • 23:19 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T410589)', diff saved to https://phabricator.wikimedia.org/P86267 and previous config saved to /var/cache/conftool/dbconfig/20251201-231949-ladsgroup.json
  • 22:50 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
  • 22:40 logmsgbot: mstyles Deployed security patch for T411144
  • 22:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2201.codfw.wmnet with reason: Maintenance
  • 22:28 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86266 and previous config saved to /var/cache/conftool/dbconfig/20251201-222810-marostegui.json
  • 22:26 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1166 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86265 and previous config saved to /var/cache/conftool/dbconfig/20251201-222607-marostegui.json
  • 22:26 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 22:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86264 and previous config saved to /var/cache/conftool/dbconfig/20251201-222544-marostegui.json
  • 22:20 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on zuul2002.codfw.wmnet with reason: reboot
  • 22:13 larssandergreen: civicrm upgraded from ee12d616 to c90bd037
  • 22:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P86263 and previous config saved to /var/cache/conftool/dbconfig/20251201-221302-marostegui.json
  • 22:11 dzahn@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host planet1004.eqiad.wmnet
  • 22:11 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host planet1004.eqiad.wmnet with OS trixie
  • 22:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P86262 and previous config saved to /var/cache/conftool/dbconfig/20251201-221036-marostegui.json
  • 21:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P86261 and previous config saved to /var/cache/conftool/dbconfig/20251201-215754-marostegui.json
  • 21:57 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on planet1004.eqiad.wmnet with reason: host reimage
  • 21:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P86260 and previous config saved to /var/cache/conftool/dbconfig/20251201-215529-marostegui.json
  • 21:52 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on planet1004.eqiad.wmnet with reason: host reimage
  • 21:42 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86259 and previous config saved to /var/cache/conftool/dbconfig/20251201-214247-marostegui.json
  • 21:42 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host planet1004.eqiad.wmnet with OS trixie
  • 21:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86258 and previous config saved to /var/cache/conftool/dbconfig/20251201-214021-marostegui.json
  • 21:37 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM planet1004.eqiad.wmnet - dzahn@cumin2002"
  • 21:37 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM planet1004.eqiad.wmnet - dzahn@cumin2002"
  • 21:36 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) planet1004.eqiad.wmnet on all recursors
  • 21:36 dzahn@cumin2002: START - Cookbook sre.dns.wipe-cache planet1004.eqiad.wmnet on all recursors
  • 21:36 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 21:36 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM planet1004.eqiad.wmnet - dzahn@cumin2002"
  • 21:36 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM planet1004.eqiad.wmnet - dzahn@cumin2002"
  • 21:36 bvibber@deploy2002: Finished scap sync-world: Backport for StickyHeaders: fix Minerva list styling for "peeking" bullet points (T409325) (duration: 07m 08s)
  • 21:32 bvibber@deploy2002: bvibber: Continuing with sync
  • 21:31 eileen: civicrm upgraded from 37ddffc2 to ee12d616
  • 21:31 bvibber@deploy2002: bvibber: Backport for StickyHeaders: fix Minerva list styling for "peeking" bullet points (T409325) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 21:29 dzahn@cumin2002: START - Cookbook sre.dns.netbox
  • 21:29 dzahn@cumin2002: START - Cookbook sre.ganeti.makevm for new host planet1004.eqiad.wmnet
  • 21:29 bvibber@deploy2002: Started scap sync-world: Backport for StickyHeaders: fix Minerva list styling for "peeking" bullet points (T409325)
  • 21:25 cscott@deploy2002: Finished scap sync-world: Backport for Deploy Parsoid Read Views to 19 wikis (T411283), Change the README to Markdown, noc: Point links in /conf to Gitiles rather than Differential, REST: enable the site.v1 module (T409516), cirrus: Apply increased near match weight on commonswiki (T408154) (duration: 12m
  • 21:21 cscott@deploy2002: cscott, ebernhardson, tgr, arlolra, bpirkle: Continuing with sync
  • 21:17 cscott@deploy2002: cscott, ebernhardson, tgr, arlolra, bpirkle: Backport for Deploy Parsoid Read Views to 19 wikis (T411283), Change the README to Markdown, noc: Point links in /conf to Gitiles rather than Differential, REST: enable the site.v1 module (T409516), cirrus: Apply increased near match weight on commonswiki (T408154)
  • 21:16 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
  • 21:16 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
  • 21:13 cscott@deploy2002: Started scap sync-world: Backport for Deploy Parsoid Read Views to 19 wikis (T411283), Change the README to Markdown, noc: Point links in /conf to Gitiles rather than Differential, REST: enable the site.v1 module (T409516), cirrus: Apply increased near match weight on commonswiki (T408154)
  • 21:03 ejegg: payments-wiki upgraded from bb179e9c to c1b83aa2
  • 20:57 urbanecm@deploy2002: Finished scap sync-world: Backport for Introduce HTML confirmation email (T396155), ConfirmEmailHooks: Do not run when UserEmailConfirmationUseHTML is true (T396155) (duration: 36m 09s)
  • 20:51 herron: prometheus100[78] grow /dev/vg0/prometheus-k8s-dse filesystems
  • 20:44 urbanecm@deploy2002: urbanecm: Continuing with sync
  • 20:44 urbanecm@deploy2002: urbanecm: Backport for Introduce HTML confirmation email (T396155), ConfirmEmailHooks: Do not run when UserEmailConfirmationUseHTML is true (T396155) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 20:37 jhathaway@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest2001.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
  • 20:26 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host sretest2001.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
  • 20:20 urbanecm@deploy2002: Started scap sync-world: Backport for Introduce HTML confirmation email (T396155), ConfirmEmailHooks: Do not run when UserEmailConfirmationUseHTML is true (T396155)
  • 20:13 jhathaway@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on sretest2001.codfw.wmnet with reason: T383173
  • 20:10 taavi@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs-low-traffic-eqiad
  • 20:09 taavi@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-low-traffic-eqiad
  • 20:08 taavi@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs-secondary-eqiad
  • 20:08 mutante: upgrading envoyproxy on contint1002; phab1004; T405808
  • 20:04 taavi@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-secondary-eqiad
  • 20:04 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2178 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86256 and previous config saved to /var/cache/conftool/dbconfig/20251201-200359-marostegui.json
  • 20:03 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2178.codfw.wmnet with reason: Maintenance
  • 20:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86255 and previous config saved to /var/cache/conftool/dbconfig/20251201-200335-marostegui.json
  • 20:02 mutante: updating envoyproxy from 1.29.x to 1.32.x on phabricator prod host
  • 19:49 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P{lvs6003*} and A:liberica
  • 19:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P86254 and previous config saved to /var/cache/conftool/dbconfig/20251201-194828-marostegui.json
  • 19:46 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P{lvs6003*} and A:liberica
  • 19:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P86253 and previous config saved to /var/cache/conftool/dbconfig/20251201-193320-marostegui.json
  • 19:28 cdobbins@cumin2002: END (FAIL) - Cookbook sre.loadbalancer.admin (exit_code=1) rebooting P{lvs6003*} and A:liberica
  • 19:25 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P{lvs6003*} and A:liberica
  • 19:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86252 and previous config saved to /var/cache/conftool/dbconfig/20251201-191812-marostegui.json
  • 19:14 cdobbins@cumin2002: END (FAIL) - Cookbook sre.loadbalancer.admin (exit_code=1) rebooting P{lvs6003*} and A:liberica
  • 19:11 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P{lvs6003*} and A:liberica
  • 19:03 cdobbins@cumin2002: END (FAIL) - Cookbook sre.loadbalancer.admin (exit_code=1) rebooting P{lvs6003*} and A:liberica
  • 19:00 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P{lvs6003*} and A:liberica
  • 18:44 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudweb1003.wikimedia.org with OS trixie
  • 18:24 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudweb1003.wikimedia.org with reason: host reimage
  • 18:18 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudweb1003.wikimedia.org with reason: host reimage
  • 18:05 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudweb1003.wikimedia.org with OS trixie
  • 18:03 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
  • 18:02 taavi@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs-secondary-eqiad
  • 18:01 taavi@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-secondary-eqiad
  • 18:00 taavi@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs-low-traffic-eqiad
  • 17:59 taavi@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-low-traffic-eqiad
  • 17:56 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
  • 17:45 taavi@cumin1003: conftool action : set/pooled=no; selector: cluster=cloudweb,name=cloudweb1003.wikimedia.org
  • 17:43 taavi@cumin1003: conftool action : set/pooled=inactive; selector: cluster=cloudweb,name=cloudweb1003.wikimedia.org
  • 17:39 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudweb1003.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 17:39 bd808@deploy2002: Finished scap sync-world: Backport for labswiki: Enable sitenotice on mobile (T410702) (duration: 06m 49s)
  • 17:39 tappof: "thanos-store: set cutoff days to 1" reverted on titan2001 (4/4) T410152
  • 17:35 bd808@deploy2002: bd808: Continuing with sync
  • 17:34 bd808@deploy2002: bd808: Backport for labswiki: Enable sitenotice on mobile (T410702) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 17:32 bd808@deploy2002: Started scap sync-world: Backport for labswiki: Enable sitenotice on mobile (T410702)
  • 17:32 andrew@cumin2002: START - Cookbook sre.hosts.provision for host cloudweb1003.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 17:31 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudweb1004.wikimedia.org with OS trixie
  • 17:17 tappof: "thanos-store: set cutoff days to 1" reverted on titan2002 (3/4) T410152
  • 17:08 hnowlan@deploy2002: Finished deploy [restbase/deploy@19cb647]: Add new wikis to restbase T408352 T408344 (duration: 16m 16s)
  • 16:59 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1157 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86251 and previous config saved to /var/cache/conftool/dbconfig/20251201-165902-marostegui.json
  • 16:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1157.eqiad.wmnet with reason: Maintenance
  • 16:58 cdobbins@cumin2002: END (FAIL) - Cookbook sre.loadbalancer.admin (exit_code=1) rebooting P{lvs6003*} and A:liberica
  • 16:55 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P{lvs6003*} and A:liberica
  • 16:52 hnowlan@deploy2002: Started deploy [restbase/deploy@19cb647]: Add new wikis to restbase T408352 T408344
  • 16:48 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudweb1004.wikimedia.org with reason: host reimage
  • 16:43 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudweb1004.wikimedia.org with reason: host reimage
  • 16:31 Emperor: depool ms-fe2014 for disk swap T410959
  • 16:31 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudweb1004.wikimedia.org with OS trixie
  • 16:30 andrew@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudweb1004.wikimedia.org with OS trixie
  • 16:29 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2158 (T410589)', diff saved to https://phabricator.wikimedia.org/P86250 and previous config saved to /var/cache/conftool/dbconfig/20251201-162923-ladsgroup.json
  • 16:29 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2158.codfw.wmnet with reason: Maintenance
  • 16:29 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T410589)', diff saved to https://phabricator.wikimedia.org/P86249 and previous config saved to /var/cache/conftool/dbconfig/20251201-162900-ladsgroup.json
  • 16:28 tappof: "thanos-store: set cutoff days to 1" reverted on titan1002 (2/4) T410152
  • 16:20 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db1187 gradually with 4 steps - After schema change
  • 16:13 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P86247 and previous config saved to /var/cache/conftool/dbconfig/20251201-161352-ladsgroup.json
  • 16:00 taavi@dns1004: END - running authdns-update
  • 15:59 taavi@dns1004: START - running authdns-update
  • 15:58 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P86245 and previous config saved to /var/cache/conftool/dbconfig/20251201-155844-ladsgroup.json
  • 15:56 tappof: "thanos-store: set cutoff days to 1" reverted on titan1001 (1/4) T410152
  • 15:56 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudweb1004.wikimedia.org with OS trixie
  • 15:56 tappof: "thanos-store: set cutoff days to 1" reverted on titan1001 (1/4)
  • 15:56 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2171 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86244 and previous config saved to /var/cache/conftool/dbconfig/20251201-155606-marostegui.json
  • 15:55 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2171.codfw.wmnet with reason: Maintenance
  • 15:55 andrew@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudweb1004.wikimedia.org with OS trixie
  • 15:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86243 and previous config saved to /var/cache/conftool/dbconfig/20251201-155542-marostegui.json
  • 15:50 inflatador: bking@wmf3062 restart wdqs codfw for high lag https://docs.google.com/spreadsheets/d/1UaabYlqj37EEaLAkrRArn4yNuNviGObgsGTfquIIHAQ/edit?gid=0#gid=0
  • 15:50 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudweb1004.wikimedia.org with OS trixie
  • 15:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1032.eqiad.wmnet with OS bookworm
  • 15:43 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T410589)', diff saved to https://phabricator.wikimedia.org/P86241 and previous config saved to /var/cache/conftool/dbconfig/20251201-154337-ladsgroup.json
  • 15:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P86240 and previous config saved to /var/cache/conftool/dbconfig/20251201-154035-marostegui.json
  • 15:34 marostegui@cumin1003: START - Cookbook sre.mysql.pool db1187 gradually with 4 steps - After schema change
  • 15:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P86238 and previous config saved to /var/cache/conftool/dbconfig/20251201-152527-marostegui.json
  • 15:24 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1032.eqiad.wmnet with reason: host reimage
  • 15:23 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic-next: apply
  • 15:22 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply
  • 15:19 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1032.eqiad.wmnet with reason: host reimage
  • 15:19 sukhe@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp2043.codfw.wmnet with OS trixie
  • 15:15 Lucas_WMDE: UTC afternoon backport+config window done
  • 15:12 kharlan@deploy2002: Finished scap sync-world: Backport for EventLogging: Register mediawiki.hcaptcha.edit stream (T406865), Set new $wgRateLimits config for edit attempt log (T406865) (duration: 11m 03s)
  • 15:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86237 and previous config saved to /var/cache/conftool/dbconfig/20251201-151019-marostegui.json
  • 15:07 kharlan@deploy2002: kharlan, sguebo: Continuing with sync
  • 15:03 kharlan@deploy2002: kharlan, sguebo: Backport for EventLogging: Register mediawiki.hcaptcha.edit stream (T406865), Set new $wgRateLimits config for edit attempt log (T406865) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 15:03 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudweb1004.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 15:01 kharlan@deploy2002: Started scap sync-world: Backport for EventLogging: Register mediawiki.hcaptcha.edit stream (T406865), Set new $wgRateLimits config for edit attempt log (T406865)
  • 14:59 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1032.eqiad.wmnet with OS bookworm
  • 14:55 andrew@cumin2002: START - Cookbook sre.hosts.provision for host cloudweb1004.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 14:54 esanders@deploy2002: Finished scap sync-world: Backport for FlowMoveBoardsToSubpages: Add 'title' option for moving a specific board (T402552) (duration: 06m 31s)
  • 14:50 esanders@deploy2002: esanders: Continuing with sync
  • 14:49 esanders@deploy2002: esanders: Backport for FlowMoveBoardsToSubpages: Add 'title' option for moving a specific board (T402552) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:47 esanders@deploy2002: Started scap sync-world: Backport for FlowMoveBoardsToSubpages: Add 'title' option for moving a specific board (T402552)
  • 14:46 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for CentralAuthUser: Cache getLocalGroups() (T410878) (duration: 14m 51s)
  • 14:42 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, matmarex: Continuing with sync
  • 14:37 slyngshede@dns1004: END - running authdns-update
  • 14:36 slyngshede@dns1004: START - running authdns-update
  • 14:33 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, matmarex: Backport for CentralAuthUser: Cache getLocalGroups() (T410878) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:31 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for CentralAuthUser: Cache getLocalGroups() (T410878)
  • 14:30 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Api: Initialise reference variable (T411075) (duration: 07m 04s)
  • 14:28 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host cp2043.codfw.wmnet with OS trixie
  • 14:26 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, matmarex: Continuing with sync
  • 14:25 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, matmarex: Backport for Api: Initialise reference variable (T411075) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:23 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Api: Initialise reference variable (T411075)
  • 14:17 mfossati@deploy2002: Finished scap sync-world: Backport for ReaderExperiments' StickyHeaders stream configuration (T410533) (duration: 11m 51s)
  • 14:11 mfossati@deploy2002: mfossati: Continuing with sync
  • 14:09 mfossati@deploy2002: mfossati: Backport for ReaderExperiments' StickyHeaders stream configuration (T410533) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:05 mfossati@deploy2002: Started scap sync-world: Backport for ReaderExperiments' StickyHeaders stream configuration (T410533)
  • 13:43 dcausse: T408431: reindexing all wikis in codfw
  • 13:42 moritzm: upgrade Envoy on deployment servers T405808
  • 13:16 moritzm: imported rancid 3.13-2+wmf12u1 for bookworm-wikimedia and 3.14-1+wmf13u1 for trixie-wikimedia T410606
  • 12:58 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.powercycle (exit_code=0) for host ml-serve1013
  • 12:53 elukey@cumin2002: START - Cookbook sre.hosts.powercycle for host ml-serve1013
  • 12:47 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve1013.eqiad.wmnet with OS trixie
  • 11:54 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 11:53 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-serve1013.eqiad.wmnet with reason: host reimage
  • 11:49 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2157 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86236 and previous config saved to /var/cache/conftool/dbconfig/20251201-114902-marostegui.json
  • 11:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2157.codfw.wmnet with reason: Maintenance
  • 11:47 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve1013.eqiad.wmnet with reason: host reimage
  • 11:36 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host ml-serve1013.eqiad.wmnet with OS trixie
  • 11:29 btullis: restarting envoyproxy process on cephosd100[1-5] for T405808
  • 11:28 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ml-serve1013.eqiad.wmnet with OS trixie
  • 11:09 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1010.eqiad.wmnet
  • 11:03 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1010.eqiad.wmnet
  • 11:02 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host ml-serve1013.eqiad.wmnet with OS trixie
  • 10:52 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.powercycle (exit_code=0) for host ml-serve1013
  • 10:51 JavierMonton: Deployed refinery using scap, then deployed onto hdfs
  • 10:47 moritzm: upgrade Envoy on matomo1001 T405808
  • 10:47 elukey@cumin2002: START - Cookbook sre.hosts.powercycle for host ml-serve1013
  • 10:46 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
  • 10:46 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
  • 10:42 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ml-serve1013.eqiad.wmnet with OS trixie
  • 10:40 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host ml-serve1013.eqiad.wmnet with OS trixie
  • 10:39 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ml-serve1013.eqiad.wmnet with OS trixie
  • 10:23 javiermonton@deploy2002: Finished deploy [analytics/refinery@fa63f82]: Regular analytics train [analytics/refinery@fa63f82e] (duration: 00m 28s)
  • 10:23 javiermonton@deploy2002: Started deploy [analytics/refinery@fa63f82]: Regular analytics train [analytics/refinery@fa63f82e]
  • 10:20 a-pizzata@deploy2002: Finished deploy [analytics/refinery@fa63f82]: Regular analytics train [analytics/refinery@fa63f82e] (duration: 02m 54s)
  • 10:17 a-pizzata@deploy2002: Started deploy [analytics/refinery@fa63f82]: Regular analytics train [analytics/refinery@fa63f82e]
  • 10:16 a-pizzata@deploy2002: Finished deploy [analytics/refinery@fa63f82] (hadoop-test): Analytics train TEST [analytics/refinery@fa63f82e] (duration: 01m 08s)
  • 10:15 a-pizzata@deploy2002: Started deploy [analytics/refinery@fa63f82] (hadoop-test): Analytics train TEST [analytics/refinery@fa63f82e]
  • 10:14 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host ml-serve1013.eqiad.wmnet with OS trixie
  • 10:13 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-main: apply
  • 10:13 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-main: apply
  • 10:12 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-main: apply
  • 10:11 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-main: apply
  • 10:11 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:11 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Change ml-serve1013 vlan - ayounsi@cumin1003"
  • 10:11 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Change ml-serve1013 vlan - ayounsi@cumin1003"
  • 10:04 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
  • 10:00 jmm@cumin2002: END (PASS) - Cookbook sre.o11y.roll-restart-reboot-logstash-collectors (exit_code=0) rolling restart_daemons on A:logstash-collector
  • 09:53 taavi@dns1004: END - running authdns-update
  • 09:53 jmm@cumin2002: START - Cookbook sre.o11y.roll-restart-reboot-logstash-collectors rolling restart_daemons on A:logstash-collector
  • 09:52 taavi@dns1004: START - running authdns-update
  • 09:39 moritzm: installing expat security updates
  • 09:15 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 09:14 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 08:58 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2151 (T410589)', diff saved to https://phabricator.wikimedia.org/P86235 and previous config saved to /var/cache/conftool/dbconfig/20251201-085828-ladsgroup.json
  • 08:58 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2151.codfw.wmnet with reason: Maintenance
  • 08:50 moritzm: upgrade Envoy on config-master* T405808
  • 08:33 mszwarc@deploy2002: Finished scap sync-world: Backport for Fix mw-userlink class being added too broadly (T392775) (duration: 38m 35s)
  • 08:29 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 08:29 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 08:19 mszwarc@deploy2002: mszwarc: Continuing with sync
  • 08:19 brouberol@dns1004: END - running authdns-update
  • 08:18 mszwarc@deploy2002: mszwarc: Backport for Fix mw-userlink class being added too broadly (T392775) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 08:18 brouberol@dns1004: START - running authdns-update
  • 07:55 mszwarc@deploy2002: Started scap sync-world: Backport for Fix mw-userlink class being added too broadly (T392775)
  • 06:47 eileen: civicrm upgraded from 1fc76c13 to 37ddffc2
  • 06:45 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 05:42 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2212.codfw.wmnet with reason: Maintenance
  • 05:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1163.eqiad.wmnet with reason: Maintenance
  • 04:53 eileen: civicrm upgraded from 6c200f91 to 1fc76c13
  • 03:39 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1245.eqiad.wmnet with reason: Maintenance
  • 03:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1230 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86234 and previous config saved to /var/cache/conftool/dbconfig/20251201-033910-marostegui.json
  • 03:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1230', diff saved to https://phabricator.wikimedia.org/P86233 and previous config saved to /var/cache/conftool/dbconfig/20251201-032402-marostegui.json
  • 03:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1230', diff saved to https://phabricator.wikimedia.org/P86232 and previous config saved to /var/cache/conftool/dbconfig/20251201-030855-marostegui.json
  • 02:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1230 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86231 and previous config saved to /var/cache/conftool/dbconfig/20251201-025347-marostegui.json
  • 01:27 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1230 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86230 and previous config saved to /var/cache/conftool/dbconfig/20251201-012716-marostegui.json
  • 01:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1230.eqiad.wmnet with reason: Maintenance
  • 01:13 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 12m 34s)
  • 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
  • 00:22 eileen: civicrm upgraded from 4437a5ef to 6c200f91

