Server Admin Log/Archive 100
2025-12-31
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
- 00:22 wfan: civicrm upgraded from 03ff6ee3 to 9d26c426
2025-12-30
- 18:48 moritzm: restarted Tomcat on idp2005
- 18:40 jmm@dns1004: END - running authdns-update
- 18:39 jmm@dns1004: START - running authdns-update
- 10:20 jgleeson: payments-wiki upgraded from 81340350 to 857e80f2
- 09:04 volans@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'clear' for AS: 18734
- 09:04 volans@cumin1003: START - Cookbook sre.network.peering with action 'clear' for AS: 18734
- 01:13 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 13m 08s)
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
2025-12-29
- 01:19 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 18m 37s)
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
2025-12-28
- 01:13 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 12m 48s)
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
2025-12-27
- 01:24 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 24m 10s)
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
2025-12-26
- 19:43 cgoubert@cumin1003: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on A:swift-fe-eqiad
- 19:37 cgoubert@cumin1003: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on A:swift-fe-eqiad
- 19:30 andrewbogott: test message
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
2025-12-25
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
2025-12-24
- 15:29 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/thumbor: sync
- 15:29 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/thumbor: sync
- 15:26 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/thumbor: sync
- 15:26 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/thumbor: sync
- 15:16 kamila@cumin1003: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on A:swift-fe-eqiad
- 15:10 kamila@cumin1003: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on A:swift-fe-eqiad
- 14:27 kamila@cumin1003: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on A:swift-fe-codfw
- 14:21 kamila@cumin1003: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on A:swift-fe-codfw
- 14:16 kamila@cumin1003: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on A:swift-fe-eqiad
- 14:09 kamila@cumin1003: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on A:swift-fe-eqiad
- 01:19 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 18m 56s)
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
2025-12-23
- 23:30 eileen: civicrm upgraded from 9cba6b6d to 03ff6ee3
- 15:46 damilare: payments-wiki upgraded from 5c9a955f to 81340350
- 15:15 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1328.eqiad.wmnet with OS trixie
- 14:58 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1328.eqiad.wmnet with reason: host reimage
- 14:51 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1328.eqiad.wmnet with reason: host reimage
- 14:45 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker1329.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 14:45 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1329.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 14:39 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1328.eqiad.wmnet with OS trixie
- 14:23 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1364.eqiad.wmnet with OS trixie
- 14:23 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
- 14:23 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
- 14:08 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1364.eqiad.wmnet with reason: host reimage
- 14:04 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1364.eqiad.wmnet with reason: host reimage
- 13:56 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply
- 13:55 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply
- 13:54 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
- 13:54 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
- 13:54 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
- 13:53 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
- 13:53 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1364.eqiad.wmnet with OS trixie
- 13:50 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker1364.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 13:42 vriley@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1364.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 13:42 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1364
- 13:41 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1364
- 13:41 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 13:41 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt wikikube-worker1364 - vriley@cumin1003"
- 13:41 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt wikikube-worker1364 - vriley@cumin1003"
- 13:40 urbanecm@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
- 13:39 urbanecm@deploy2002: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
- 13:37 vriley@cumin1003: START - Cookbook sre.dns.netbox
- 13:15 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
- 13:14 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
- 13:14 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
- 13:14 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
- 13:14 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply
- 13:14 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply
- 13:12 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
- 13:12 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
- 12:43 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
- 12:43 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
- 12:43 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
- 12:42 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
- 12:42 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply
- 12:41 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply
- 12:27 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
- 12:27 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
- 11:57 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker1363.eqiad.wmnet with OS trixie
- 11:16 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1361.eqiad.wmnet with OS trixie
- 11:16 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
- 11:15 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
- 11:12 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply
- 11:12 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply
- 11:11 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
- 11:09 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
- 11:09 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
- 11:08 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
- 10:59 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1361.eqiad.wmnet with reason: host reimage
- 10:53 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1361.eqiad.wmnet with reason: host reimage
- 10:42 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1361.eqiad.wmnet with OS trixie
- 10:37 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1363.eqiad.wmnet with OS trixie
- 09:37 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker1363.eqiad.wmnet with OS trixie
- 08:17 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1363.eqiad.wmnet with OS trixie
- 08:12 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker1363.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 08:11 slyngshede@cumin1003: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Resquito out of all services on: 1 hosts
- 08:07 slyngshede@cumin1003: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Resquito out of all services on: 1 hosts
- 08:07 slyngshede@cumin1003: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Resquito out of all services on: 1 hosts
- 08:07 slyngshede@cumin1003: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Resquito out of all services on: 1 hosts
- 08:07 slyngshede@cumin1003: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Resquito out of all services on: 1 hosts
- 08:05 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1362.eqiad.wmnet with OS trixie
- 08:05 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
- 08:05 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
- 08:04 slyngshede@cumin1003: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Resquito out of all services on: 2444 hosts
- 08:03 vriley@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1363.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 08:03 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1363
- 08:02 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1363
- 08:02 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 08:02 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt wikikube-worker1363 - vriley@cumin1003"
- 08:02 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt wikikube-worker1363 - vriley@cumin1003"
- 07:58 vriley@cumin1003: START - Cookbook sre.dns.netbox
- 07:51 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker1361.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 07:49 vriley@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1361.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 07:49 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1362.eqiad.wmnet with reason: host reimage
- 07:46 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1361
- 07:45 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1361
- 07:44 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 07:43 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1362.eqiad.wmnet with reason: host reimage
- 07:41 vriley@cumin1003: START - Cookbook sre.dns.netbox
- 07:39 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker1361.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 07:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1361.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 07:37 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker1361.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 07:36 vriley@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1361.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 07:32 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1362.eqiad.wmnet with OS trixie
- 07:30 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker1362.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 07:30 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker1361.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 07:29 vriley@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1361.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 07:23 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker1361.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 07:22 vriley@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1361.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 07:22 vriley@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1362.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 07:20 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1362
- 07:19 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker1361.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 07:19 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1362
- 07:18 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 07:18 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt wikikube-worker1362 - vriley@cumin1003"
- 07:18 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt wikikube-worker1362 - vriley@cumin1003"
- 07:17 vriley@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1361.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 07:15 vriley@cumin1003: START - Cookbook sre.dns.netbox
- 07:13 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker1361.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 07:06 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1360.eqiad.wmnet with OS trixie
- 07:06 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
- 07:05 vriley@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1361.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 07:05 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1361
- 07:04 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
- 07:03 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1361
- 07:03 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 07:03 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt wikikube-worker1361 - vriley@cumin1003"
- 07:03 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt wikikube-worker1361 - vriley@cumin1003"
- 06:58 vriley@cumin1003: START - Cookbook sre.dns.netbox
- 06:48 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1360.eqiad.wmnet with reason: host reimage
- 06:45 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1360.eqiad.wmnet with reason: host reimage
- 06:34 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1360.eqiad.wmnet with OS trixie
- 06:25 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker1360.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 06:16 vriley@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1360.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 06:13 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1360
- 06:12 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1360
- 06:11 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 06:11 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt wikikube-worker1360 - vriley@cumin1003"
- 06:11 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt wikikube-worker1360 - vriley@cumin1003"
- 06:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.newpool (exit_code=0) pc1013 gradually with 4 steps - test
- 06:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 06:10 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
- 06:10 marostegui@cumin1003: START - Cookbook sre.mysql.newpool pc1013 gradually with 4 steps - test
- 06:09 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.newdepool (exit_code=0) pc1013 - test
- 06:09 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 06:09 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
- 06:09 marostegui@cumin1003: START - Cookbook sre.mysql.newdepool pc1013 - test
- 06:07 vriley@cumin1003: START - Cookbook sre.dns.netbox
- 06:01 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts es2028.codfw.wmnet
- 06:01 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 06:01 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: es2028.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
- 06:00 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: es2028.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
- 05:56 marostegui@cumin1003: START - Cookbook sre.dns.netbox
- 05:50 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts es2028.codfw.wmnet
- 05:02 mwpresync@deploy2002: Pruned MediaWiki: 1.46.0-wmf.4 (duration: 02m 34s)
2025-12-22
- 22:14 jgleeson: civicrm upgraded from 110aeb6d to 9cba6b6d
- 21:20 eileen: civicrm upgraded from d678d34e to 110aeb6d
- 20:30 eileen: civicrm upgraded from 4eee8c62 to d678d34e
- 19:45 mforns@deploy2002: helmfile [codfw] DONE helmfile.d/services/page-analytics: apply
- 19:45 mforns@deploy2002: helmfile [codfw] START helmfile.d/services/page-analytics: apply
- 19:44 mforns@deploy2002: helmfile [eqiad] DONE helmfile.d/services/page-analytics: apply
- 19:44 mforns@deploy2002: helmfile [eqiad] START helmfile.d/services/page-analytics: apply
- 19:44 mforns@deploy2002: helmfile [staging] DONE helmfile.d/services/page-analytics: apply
- 19:44 mforns@deploy2002: helmfile [staging] START helmfile.d/services/page-analytics: apply
- 19:43 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1333.eqiad.wmnet with OS trixie
- 19:43 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 19:42 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 19:41 sbisson@deploy2002: Finished scap sync-world: Backport for Fix section loading on desktop (T413305) (duration: 20m 44s)
- 19:39 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1330.eqiad.wmnet with OS trixie
- 19:39 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 19:39 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 19:34 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1331.eqiad.wmnet with OS trixie
- 19:34 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 19:34 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 19:30 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1332.eqiad.wmnet with OS trixie
- 19:30 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 19:30 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 19:28 sbisson@deploy2002: sbisson: Continuing with sync
- 19:26 sbisson@deploy2002: sbisson: Backport for Fix section loading on desktop (T413305) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 19:26 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1333.eqiad.wmnet with reason: host reimage
- 19:23 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1329.eqiad.wmnet with OS trixie
- 19:23 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 19:22 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 19:22 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1330.eqiad.wmnet with reason: host reimage
- 19:20 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1334.eqiad.wmnet with OS trixie
- 19:20 sbisson@deploy2002: Started scap sync-world: Backport for Fix section loading on desktop (T413305)
- 19:20 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 19:20 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 19:18 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1331.eqiad.wmnet with reason: host reimage
- 19:14 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1332.eqiad.wmnet with reason: host reimage
- 19:07 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1329.eqiad.wmnet with reason: host reimage
- 19:03 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1334.eqiad.wmnet with reason: host reimage
- 19:03 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1333.eqiad.wmnet with reason: host reimage
- 19:03 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1332.eqiad.wmnet with reason: host reimage
- 19:01 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1331.eqiad.wmnet with reason: host reimage
- 19:01 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1330.eqiad.wmnet with reason: host reimage
- 18:59 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1329.eqiad.wmnet with reason: host reimage
- 18:58 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1334.eqiad.wmnet with reason: host reimage
- 18:52 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1333.eqiad.wmnet with OS trixie
- 18:52 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1332.eqiad.wmnet with OS trixie
- 18:50 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1331.eqiad.wmnet with OS trixie
- 18:50 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1328.eqiad.wmnet with OS trixie
- 18:50 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 18:49 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1330.eqiad.wmnet with OS trixie
- 18:49 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 18:49 sbisson@deploy2002: Started scap sync-world: Backport for Fix section loading on desktop (T413305)
- 18:48 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1329.eqiad.wmnet with OS trixie
- 18:47 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1334.eqiad.wmnet with OS trixie
- 18:33 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1328.eqiad.wmnet with reason: host reimage
- 18:28 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1328.eqiad.wmnet with reason: host reimage
- 18:00 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1328.eqiad.wmnet with OS trixie
- 17:50 sbisson@deploy2002: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.5,1.46.0-wmf.7,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/med
- 17:46 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker1334.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 17:35 elukey@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1334.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 17:35 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker1333.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 17:30 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wdqs1032.eqiad.wmnet with OS trixie
- 17:29 mforns@deploy2002: helmfile [codfw] DONE helmfile.d/services/edit-analytics: apply
- 17:29 mforns@deploy2002: helmfile [codfw] START helmfile.d/services/edit-analytics: apply
- 17:26 mforns@deploy2002: helmfile [eqiad] DONE helmfile.d/services/edit-analytics: apply
- 17:26 mforns@deploy2002: helmfile [eqiad] START helmfile.d/services/edit-analytics: apply
- 17:24 elukey@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1333.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 17:24 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker1332.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 17:22 sbisson@deploy2002: Started scap sync-world: Backport for Fix section loading on desktop (T413305)
- 17:13 elukey@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1332.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 17:11 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker1331.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 17:00 elukey@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1331.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 16:58 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1032.eqiad.wmnet with reason: host reimage
- 16:54 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker1330.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 16:52 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1032.eqiad.wmnet with reason: host reimage
- 16:49 tappof: lvextend /dev/vg0/srv on titan1001, titan1002, titan2002. T410152
- 16:46 fabfur@dns1004: END - running authdns-update
- 16:45 fabfur@dns1004: START - running authdns-update
- 16:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1330.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 16:41 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker1329.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 16:41 sbisson@deploy2002: Started scap sync-world: Backport for Fix section loading on desktop (T413305)
- 16:33 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1032.eqiad.wmnet with OS trixie
- 16:32 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1031.eqiad.wmnet with OS trixie
- 16:30 elukey@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1329.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 16:26 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker1328.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 16:22 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker1330.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 16:22 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker1331.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 16:21 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1331.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 16:21 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1330.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 16:15 elukey@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1328.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 16:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1031.eqiad.wmnet with reason: host reimage
- 16:12 damilare: donorwiki upgraded from 14e22620 to 5c9a955f
- 16:09 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1031.eqiad.wmnet with reason: host reimage
- 16:09 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker1334.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 16:09 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker1333.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 16:08 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1334.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 16:08 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1333.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 16:07 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker1332.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 16:07 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker1331.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 16:07 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1332.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 16:06 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1331.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 16:06 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker1330.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 16:06 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker1329.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 16:05 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1330.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 16:05 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1329.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 15:59 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker1328.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 15:58 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1328.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 15:53 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:53 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added network and mgmt wikikube1328-34 servers - jclark@cumin1003"
- 15:53 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added network and mgmt wikikube1328-34 servers - jclark@cumin1003"
- 15:51 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1031.eqiad.wmnet with OS trixie
- 15:50 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wdqs1031.eqiad.wmnet with OS trixie
- 15:49 jclark@cumin1003: START - Cookbook sre.dns.netbox
- 15:48 akosiaris@deploy2002: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 15:48 akosiaris@deploy2002: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
- 15:48 akosiaris@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 15:48 akosiaris@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
- 15:47 akosiaris@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 15:47 akosiaris@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 15:47 akosiaris@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 15:45 akosiaris@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 15:45 akosiaris@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
- 15:45 akosiaris@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
- 15:45 akosiaris@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
- 15:44 akosiaris: remove limits from kube-state-metrics in ml-serve-{eqiad,codfw} ml-staging-codfw dse-k8s-{eqiad,codfw} aux-k8s-{eqiad,codfw} kubernetes clusters. No point in resource limits for this workload, it's an important cluster component.
- 15:44 akosiaris@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
- 15:44 akosiaris@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
- 15:44 akosiaris@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
- 15:39 akosiaris@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
- 15:38 akosiaris: remove limits from kube-state-metrics in wikikube and wikikube-staging clusters, no point in resource limits for this workload, it's an important cluster component
- 15:38 akosiaris@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
- 15:38 akosiaris@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 15:38 akosiaris@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
- 15:38 akosiaris@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
- 15:37 akosiaris@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
- 15:36 akosiaris@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
- 15:36 akosiaris@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
- 14:58 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1031.eqiad.wmnet with reason: host reimage
- 14:53 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1031.eqiad.wmnet with reason: host reimage
- 14:35 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1031.eqiad.wmnet with OS trixie
- 14:34 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1030.eqiad.wmnet with OS trixie
- 14:32 urandom: serveraction powercycle restbase2034 (down, unresponsive)
- 14:24 jgleeson: payments-wiki upgraded from 4d41d604 to 5c9a955f
- 14:18 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
- 14:18 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
- 14:15 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1030.eqiad.wmnet with reason: host reimage
- 14:11 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1030.eqiad.wmnet with reason: host reimage
- 13:53 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1030.eqiad.wmnet with OS trixie
- 13:32 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db1198 gradually with 4 steps - repooling
- 13:16 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.newpool (exit_code=0) es2051 gradually with 4 steps - test T383674
- 12:47 marostegui@cumin1003: START - Cookbook sre.mysql.pool db1198 gradually with 4 steps - repooling
- 12:30 fceratto@cumin1003: START - Cookbook sre.mysql.newpool es2051 gradually with 4 steps - test T383674
- 12:21 kevinbazira@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 11:57 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
- 11:56 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1029.eqiad.wmnet with OS trixie
- 11:43 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
- 11:38 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1029.eqiad.wmnet with reason: host reimage
- 11:32 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1029.eqiad.wmnet with reason: host reimage
- 11:14 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
- 11:12 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.newdepool (exit_code=0) es2051 - test T383674
- 11:10 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
- 11:10 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1029.eqiad.wmnet with OS trixie
- 11:04 fceratto@cumin1003: START - Cookbook sre.mysql.newdepool es2051 - test T383674
- 10:37 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 10:35 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 10:28 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 10:28 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 10:28 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 10:27 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 10:23 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 10:23 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 10:18 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 10:07 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 10:05 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 10:04 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 09:31 kevinbazira@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 09:02 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2249.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 08:57 elukey@cumin1003: START - Cookbook sre.hosts.provision for host db2249.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 08:47 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2249.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 08:47 elukey@cumin1003: START - Cookbook sre.hosts.provision for host db2249.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 08:40 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2249.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 08:40 elukey@cumin1003: START - Cookbook sre.hosts.provision for host db2249.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 08:32 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2249.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 08:28 elukey@cumin1003: START - Cookbook sre.hosts.provision for host db2249.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 08:14 elukey@deploy2002: Finished deploy [docker-pkg/deploy@1664255]: (no justification provided) (duration: 00m 08s)
- 08:14 elukey@deploy2002: Started deploy [docker-pkg/deploy@1664255]: (no justification provided)
- 08:04 elukey@deploy2002: Finished deploy [docker-pkg/deploy@1664255]: (no justification provided) (duration: 00m 07s)
- 08:04 elukey@deploy2002: Started deploy [docker-pkg/deploy@1664255]: (no justification provided)
- 08:03 elukey@deploy2002: Finished deploy [docker-pkg/deploy@1664255]: (no justification provided) (duration: 00m 11s)
- 08:03 elukey@deploy2002: Started deploy [docker-pkg/deploy@1664255]: (no justification provided)
- 07:32 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2028.codfw.wmnet with OS trixie
- 07:09 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2028.codfw.wmnet with reason: host reimage
- 07:05 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2028.codfw.wmnet with reason: host reimage
- 06:49 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2028.codfw.wmnet with OS trixie
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
2025-12-21
- 23:48 eileen: config revision changed from e478c565 to 8e95f98e
- 01:24 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 23m 45s)
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
2025-12-20
- 15:24 dzahn@dns1004: END - running authdns-update
- 15:23 dzahn@dns1004: START - running authdns-update
- 01:23 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 22m 39s)
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
2025-12-19
- 21:03 jhancock@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2249.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 21:03 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host db2249.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 21:02 jhancock@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2249.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 21:02 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host db2249.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 21:01 jhancock@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2249.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 21:01 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host db2249.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 21:00 jhancock@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2249.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 21:00 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host db2249.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 20:59 jhancock@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2249.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 20:59 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host db2249.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 20:58 jhancock@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2249.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 20:56 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host db2249.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 20:55 jhancock@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db2249
- 20:54 jhancock@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db2249
- 20:50 jhancock@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 20:50 jhancock@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2249 to codfw - jhancock@cumin1003"
- 20:50 jhancock@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2249 to codfw - jhancock@cumin1003"
- 20:47 jhancock@cumin1003: START - Cookbook sre.dns.netbox
- 18:16 mforns@deploy2002: helmfile [codfw] DONE helmfile.d/services/page-analytics: apply
- 18:16 mforns@deploy2002: helmfile [codfw] START helmfile.d/services/page-analytics: apply
- 18:16 mforns@deploy2002: helmfile [eqiad] DONE helmfile.d/services/page-analytics: apply
- 18:15 mforns@deploy2002: helmfile [eqiad] START helmfile.d/services/page-analytics: apply
- 18:15 mforns@deploy2002: helmfile [staging] DONE helmfile.d/services/page-analytics: apply
- 18:15 mforns@deploy2002: helmfile [staging] START helmfile.d/services/page-analytics: apply
- 17:45 cscott@deploy2002: Finished scap sync-world: Backport for Ensure that user interface language is "used" by postprocessing pipeline (T413227) (duration: 09m 07s)
- 17:41 cscott@deploy2002: cscott: Continuing with sync
- 17:38 cscott@deploy2002: cscott: Backport for Ensure that user interface language is "used" by postprocessing pipeline (T413227) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 17:36 cscott@deploy2002: Started scap sync-world: Backport for Ensure that user interface language is "used" by postprocessing pipeline (T413227)
- 17:21 mforns@deploy2002: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
- 17:21 mforns@deploy2002: helmfile [staging] START helmfile.d/services/edit-analytics: apply
- 17:09 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
- 17:09 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
- 16:44 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wdqs1029.eqiad.wmnet with OS bookworm
- 16:18 ejegg: civicrm upgraded from 878d168c to 4eee8c62
- 16:09 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
- 16:09 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
- 15:45 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1029.eqiad.wmnet with reason: host reimage
- 15:41 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1029.eqiad.wmnet with reason: host reimage
- 15:21 elukey: restored the correct puppetserver1001's TLS certificate for puppet following https://phabricator.wikimedia.org/T405580#11214327
- 15:20 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1029.eqiad.wmnet with OS bookworm
- 15:07 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wdqs1029.eqiad.wmnet with OS bookworm
- 15:06 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1029.eqiad.wmnet with OS bookworm
- 13:50 mforns@deploy2002: helmfile [codfw] DONE helmfile.d/services/page-analytics: apply
- 13:50 mforns@deploy2002: helmfile [codfw] START helmfile.d/services/page-analytics: apply
- 13:50 mforns@deploy2002: helmfile [eqiad] DONE helmfile.d/services/page-analytics: apply
- 13:50 mforns@deploy2002: helmfile [eqiad] START helmfile.d/services/page-analytics: apply
- 13:48 mforns@deploy2002: helmfile [staging] DONE helmfile.d/services/page-analytics: apply
- 13:47 mforns@deploy2002: helmfile [staging] START helmfile.d/services/page-analytics: apply
- 13:41 jgleeson: payments-wiki upgraded from 14e22620 to 4d41d604
- 13:39 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es2028.codfw.wmnet with OS trixie
- 13:12 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 13:11 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 13:10 btullis@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 13:10 btullis@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 12:50 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
- 12:50 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
- 11:11 mforns@deploy2002: helmfile [staging] DONE helmfile.d/services/page-analytics: apply
- 11:11 mforns@deploy2002: helmfile [staging] START helmfile.d/services/page-analytics: apply
- 10:25 elukey@deploy2002: Finished deploy [docker-pkg/deploy@b6cc5ab]: (no justification provided) (duration: 00m 12s)
- 10:25 elukey@deploy2002: Started deploy [docker-pkg/deploy@b6cc5ab]: (no justification provided)
- 10:12 jelto@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
- 10:12 jelto@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
- 10:12 jelto@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
- 10:11 jelto@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
- 10:11 stran@deploy2002: Finished scap sync-world: Backport for Only show temp accounts on IP if temp accounts are known (T413139) (duration: 07m 37s)
- 10:07 stran@deploy2002: mszwarc, stran: Continuing with sync
- 10:05 stran@deploy2002: mszwarc, stran: Backport for Only show temp accounts on IP if temp accounts are known (T413139) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 10:03 stran@deploy2002: Started scap sync-world: Backport for Only show temp accounts on IP if temp accounts are known (T413139)
- 10:00 elukey@deploy2002: Finished deploy [docker-pkg/deploy@1769f71]: (no justification provided) (duration: 00m 44s)
- 09:59 elukey@deploy2002: Started deploy [docker-pkg/deploy@1769f71]: (no justification provided)
- 09:56 moritzm: installing Linux 5.10.247 on Bullseye hosts
- 09:21 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
- 09:21 jelto@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
- 08:50 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 138881
- 08:50 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 138881
- 08:44 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 8560
- 08:41 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 8560
- 07:55 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2028.codfw.wmnet with OS trixie
- 05:16 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2218.codfw.wmnet with reason: Maintenance
- 02:53 ejegg: payments-wiki upgraded from 8a207d81 to 14e22620
- 01:13 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 13m 04s)
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
2025-12-18
- 22:24 logmsgbot: mstyles Deployed security patch for T384147
- 22:15 jhathaway: uploading corto 1.0.21
- 22:11 cwhite@deploy2002: Finished deploy [statsv/statsv@0751b0b]: T383563 (duration: 00m 10s)
- 22:11 cwhite@deploy2002: Started deploy [statsv/statsv@0751b0b]: T383563
- 21:51 swfrench@cumin2002: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Deploy: Various UI improvements - swfrench@cumin2002"
- 21:51 swfrench@cumin2002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Deploy: Various UI improvements - swfrench@cumin2002
- 21:50 swfrench@cumin2002: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Deploy: Various UI improvements - swfrench@cumin2002
- 21:50 swfrench@cumin2002: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Deploy: Various UI improvements - swfrench@cumin2002"
- 21:47 eileen: civicrm upgraded from 0560cfd9 to 878d168c
- 21:24 toyofuku@deploy2002: Finished scap sync-world: Backport for [Legal Footer] Deploy Legal Footer for Phase 1 wikis (T412455) (duration: 07m 04s)
- 21:19 toyofuku@deploy2002: toyofuku, lmora: Continuing with sync
- 21:19 toyofuku@deploy2002: toyofuku, lmora: Backport for [Legal Footer] Deploy Legal Footer for Phase 1 wikis (T412455) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:16 toyofuku@deploy2002: Started scap sync-world: Backport for [Legal Footer] Deploy Legal Footer for Phase 1 wikis (T412455)
- 21:13 tgr@deploy2002: Finished scap sync-world: Backport for Remove LoggedOut cookie logic (T142542), Turn on Parsoid Read Views on itwiki (T413084), Logos: Handle missing responsive URLs, manually modify thumbnail sizes to avoid $wgThumbnailSteps (T405169) (duration: 06m 28s)
- 21:09 tgr@deploy2002: pppery, tgr, cscott: Continuing with sync
- 21:08 tgr@deploy2002: pppery, tgr, cscott: Backport for Remove LoggedOut cookie logic (T142542), Turn on Parsoid Read Views on itwiki (T413084), Logos: Handle missing responsive URLs, manually modify thumbnail sizes to avoid $wgThumbnailSteps (T405169) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:06 tgr@deploy2002: Started scap sync-world: Backport for Remove LoggedOut cookie logic (T142542), Turn on Parsoid Read Views on itwiki (T413084), Logos: Handle missing responsive URLs, manually modify thumbnail sizes to avoid $wgThumbnailSteps (T405169)
- 20:49 dancy@deploy2002: Installation of scap version "4.230.0" completed for 1 hosts
- 20:48 dancy@deploy2002: Installing scap version "4.230.0" for 1 host(s)
- 20:47 dancy@deploy2002: Installation of scap version "4.230.0" completed for 2 hosts
- 20:45 dancy@deploy2002: Installing scap version "4.230.0" for 2 host(s)
- 19:10 dancy@deploy2002: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.7 refs T408277
- 18:46 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
- 18:46 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
- 18:46 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
- 18:45 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
- 18:45 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
- 18:45 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
- 18:42 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
- 18:42 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
- 18:42 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
- 18:42 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
- 18:42 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
- 18:42 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
- 18:41 ejegg: donorwiki upgraded from 99671dda to 14e22620
- 17:26 dreamyjazz@deploy2002: Finished scap sync-world: Backport for CheckUser: Set $wgCheckUserLogMaxRangeToShowInLog (T320769) (duration: 06m 46s)
- 17:22 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
- 17:22 dreamyjazz@deploy2002: dreamyjazz: Backport for CheckUser: Set $wgCheckUserLogMaxRangeToShowInLog (T320769) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 17:19 dreamyjazz@deploy2002: Started scap sync-world: Backport for CheckUser: Set $wgCheckUserLogMaxRangeToShowInLog (T320769)
- 16:43 elukey@deploy2002: Finished deploy [docker-pkg/deploy@a8e9cb3]: (no justification provided) (duration: 00m 15s)
- 16:42 elukey@deploy2002: Started deploy [docker-pkg/deploy@a8e9cb3]: (no justification provided)
- 16:27 elukey@deploy2002: Finished deploy [docker-pkg/deploy@a8e9cb3]: (no justification provided) (duration: 00m 12s)
- 16:27 elukey@deploy2002: Started deploy [docker-pkg/deploy@a8e9cb3]: (no justification provided)
- 15:05 Lucas_WMDE: UTC afternoon backport+config window done
- 15:03 cscott@deploy2002: Finished scap sync-world: Backport for Turn on Parsoid Read Views on nlwiki (T413084) (duration: 09m 12s)
- 14:58 cscott@deploy2002: cscott: Continuing with sync
- 14:56 cscott@deploy2002: cscott: Backport for Turn on Parsoid Read Views on nlwiki (T413084) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:54 cscott@deploy2002: Started scap sync-world: Backport for Turn on Parsoid Read Views on nlwiki (T413084)
- 14:50 sgimeno@deploy2002: Finished scap sync-world: Backport for UserImpact: stop using pre-computed impact in the user impact job (T398500) (duration: 09m 31s)
- 14:46 sgimeno@deploy2002: sgimeno: Continuing with sync
- 14:43 sgimeno@deploy2002: sgimeno: Backport for UserImpact: stop using pre-computed impact in the user impact job (T398500) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:41 sgimeno@deploy2002: Started scap sync-world: Backport for UserImpact: stop using pre-computed impact in the user impact job (T398500)
- 14:28 moritzm: installing rubygems security updates
- 14:25 derick@deploy2002: Finished scap sync-world: Backport for Rest: Add more debug logging for `Resource::getProfile()` (T409901), Rest: Add more debug logging for `Resource::getProfile()` (T409901) (duration: 06m 54s)
- 14:21 derick@deploy2002: d3r1ck01, derick: Continuing with sync
- 14:20 derick@deploy2002: d3r1ck01, derick: Backport for Rest: Add more debug logging for `Resource::getProfile()` (T409901), Rest: Add more debug logging for `Resource::getProfile()` (T409901) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:18 derick@deploy2002: Started scap sync-world: Backport for Rest: Add more debug logging for `Resource::getProfile()` (T409901), Rest: Add more debug logging for `Resource::getProfile()` (T409901)
- 14:14 stran@deploy2002: Finished scap sync-world: Backport for Revert^2 "Enable v2 non-emergency workflow by default" (duration: 08m 50s)
- 14:10 stran@deploy2002: stran: Continuing with sync
- 14:08 stran@deploy2002: stran: Backport for Revert^2 "Enable v2 non-emergency workflow by default" synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:05 stran@deploy2002: Started scap sync-world: Backport for Revert^2 "Enable v2 non-emergency workflow by default"
- 14:02 dbrant@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
- 14:01 dbrant@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
- 14:01 dbrant@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
- 14:00 dbrant@deploy2002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
- 14:00 dbrant@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
- 13:59 dbrant@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
- 13:43 cmooney@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es2028.codfw.wmnet with OS trixie
- 13:14 dbrant@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifeeds: apply
- 13:14 dbrant@deploy2002: helmfile [codfw] START helmfile.d/services/wikifeeds: apply
- 13:14 dbrant@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifeeds: apply
- 13:13 dbrant@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifeeds: apply
- 13:13 dbrant@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
- 13:12 dbrant@deploy2002: helmfile [staging] START helmfile.d/services/wikifeeds: apply
- 12:53 root@cumin2002: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Joely Rooke WMDE out of all services on: 2435 hosts
- 12:41 cmooney@cumin1003: START - Cookbook sre.hosts.reimage for host es2028.codfw.wmnet with OS trixie
- 11:22 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
- 11:21 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-cron: apply
- 09:36 Emperor: restart swift-container-sync on ms-be2081 T413008
- 08:56 ammarpad@deploy2002: mwscript-k8s job started: extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=enwikisource --logwiki=metawiki 'Anurag Bhattamishra' 'Renamed user d198c4f693b15534f61d97349d9d7d8e' # T413036
- 07:55 moritzm: bounced slapd on serpens after cleaning up a failed logrotate
- 05:45 musikanimal@deploy2002: Finished scap sync-world: Backport for codemirror.less: order the gutters (T412884), CodeMirror: disable spellcheck for non-wikitext (T412848), extension.json: make activeLine on by default for non-wikitext (T412886), CodeMirrorJavaScript: better descriptions for ESLint suggestions (duration: 12m 04s)
- 05:41 musikanimal@deploy2002: musikanimal: Continuing with sync
- 05:35 musikanimal@deploy2002: musikanimal: Backport for codemirror.less: order the gutters (T412884), CodeMirror: disable spellcheck for non-wikitext (T412848), extension.json: make activeLine on by default for non-wikitext (T412886), CodeMirrorJavaScript: better descriptions for ESLint suggestions synced to the testservers (see https://wikitech.wik
- 05:33 musikanimal@deploy2002: Started scap sync-world: Backport for codemirror.less: order the gutters (T412884), CodeMirror: disable spellcheck for non-wikitext (T412848), extension.json: make activeLine on by default for non-wikitext (T412886), CodeMirrorJavaScript: better descriptions for ESLint suggestions
- 04:03 eileen: civicrm upgraded from 12b8fa9d to 0560cfd9
- 02:54 musikanimal@deploy2002: Finished scap sync-world: Backport for Use CodeMirror instead of CodeEditor for beta feature users + vue mode (T373711) (duration: 07m 15s)
- 02:50 musikanimal@deploy2002: musikanimal: Continuing with sync
- 02:49 musikanimal@deploy2002: musikanimal: Backport for Use CodeMirror instead of CodeEditor for beta feature users + vue mode (T373711) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 02:47 musikanimal@deploy2002: Started scap sync-world: Backport for Use CodeMirror instead of CodeEditor for beta feature users + vue mode (T373711)
- 01:24 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 23m 23s)
- 01:01 rzl@deploy2002: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
- 01:01 rzl@deploy2002: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
- 01:01 rzl@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
- 01:00 rzl@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
- 00:56 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/zotero: apply
- 00:56 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/zotero: apply
- 00:56 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
- 00:55 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
- 00:55 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifeeds: apply
- 00:54 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/wikifeeds: apply
- 00:54 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikidata-query-gui: apply
- 00:54 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/wikidata-query-gui: apply
- 00:54 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/toolhub: apply
- 00:53 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/toolhub: apply
- 00:53 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/termbox: apply
- 00:52 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/termbox: apply
- 00:50 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: apply
- 00:50 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: apply
- 00:49 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/sessionstore: apply
- 00:49 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/sessionstore: apply
- 00:48 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/recommendation-api: apply
- 00:48 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/recommendation-api: apply
- 00:48 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/rdf-streaming-updater: apply
- 00:48 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/rdf-streaming-updater: apply
- 00:47 cstone: civicrm upgraded from 28ef5eb1 to 12b8fa9d
- 00:47 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/push-notifications: apply
- 00:46 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/push-notifications: apply
- 00:46 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/proton: apply
- 00:45 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/proton: apply
- 00:44 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/page-analytics: apply
- 00:44 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/page-analytics: apply
- 00:44 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
- 00:44 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
- 00:43 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
- 00:43 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
- 00:42 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
- 00:40 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
- 00:37 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
- 00:37 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/media-analytics: apply
- 00:37 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/machinetranslation: apply
- 00:19 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/machinetranslation: apply
- 00:19 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
- 00:18 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/ipoid: apply
- 00:18 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/image-suggestion: apply
- 00:18 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/image-suggestion: apply
- 00:17 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/geo-analytics: apply
- 00:17 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/geo-analytics: apply
- 00:17 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventstreams-internal: apply
- 00:15 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/eventstreams-internal: apply
- 00:15 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
- 00:14 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/eventstreams: apply
- 00:14 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-main: apply
- 00:13 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-main: apply
- 00:13 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-logging-external: apply
- 00:13 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-logging-external: apply
- 00:12 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply
- 00:12 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply
- 00:12 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics: apply
- 00:11 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-analytics: apply
- 00:10 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/editor-analytics: apply
- 00:10 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/editor-analytics: apply
- 00:10 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/edit-analytics: apply
- 00:10 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/edit-analytics: apply
- 00:10 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/echostore: apply
- 00:09 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/echostore: apply
- 00:08 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/device-analytics: apply
- 00:08 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/device-analytics: apply
- 00:08 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
- 00:08 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/developer-portal: apply
- 00:07 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/data-gateway: apply
- 00:07 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/data-gateway: apply
- 00:07 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/commons-impact-analytics: apply
- 00:06 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/commons-impact-analytics: apply
- 00:05 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 00:05 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
- 00:05 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/chart-renderer: apply
- 00:04 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/chart-renderer: apply
- 00:04 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/apertium: apply
- 00:03 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/apertium: apply
2025-12-17
- 23:52 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/zotero: apply
- 23:52 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/zotero: apply
- 23:52 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
- 23:51 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
- 23:51 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifeeds: apply
- 23:50 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifeeds: apply
- 23:49 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikidata-query-gui: apply
- 23:49 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/wikidata-query-gui: apply
- 23:48 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/toolhub: apply
- 23:48 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/toolhub: apply
- 23:47 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
- 23:47 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/thumbor: apply
- 23:46 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/termbox: apply
- 23:45 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/termbox: apply
- 23:45 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: apply
- 23:45 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: apply
- 23:44 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/sessionstore: apply
- 23:44 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/sessionstore: apply
- 23:43 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/recommendation-api: apply
- 23:43 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/recommendation-api: apply
- 23:43 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rdf-streaming-updater: apply
- 23:43 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/rdf-streaming-updater: apply
- 23:42 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/push-notifications: apply
- 23:42 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/push-notifications: apply
- 23:42 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/proton: apply
- 23:41 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/proton: apply
- 23:36 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/page-analytics: apply
- 23:35 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/page-analytics: apply
- 23:35 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-page-content-change-enrich: apply
- 23:35 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-page-content-change-enrich: apply
- 23:34 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
- 23:34 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
- 23:33 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
- 23:31 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
- 23:31 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
- 23:31 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
- 23:30 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/machinetranslation: apply
- 23:18 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/machinetranslation: apply
- 23:14 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply
- 23:13 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply
- 23:13 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
- 23:13 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/ipoid: apply
- 23:12 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/image-suggestion: apply
- 23:12 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/image-suggestion: apply
- 23:12 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/geo-analytics: apply
- 23:11 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/geo-analytics: apply
- 23:11 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventstreams-internal: apply
- 23:10 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/eventstreams-internal: apply
- 23:10 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
- 23:10 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
- 23:09 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: apply
- 23:09 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-main: apply
- 23:09 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-logging-external: apply
- 23:08 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-logging-external: apply
- 23:08 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics-external: apply
- 23:08 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-analytics-external: apply
- 23:08 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics: apply
- 23:07 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-analytics: apply
- 23:07 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/editor-analytics: apply
- 23:06 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/editor-analytics: apply
- 23:06 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/edit-analytics: apply
- 23:06 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/edit-analytics: apply
- 23:05 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/echostore: apply
- 23:04 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/echostore: apply
- 23:04 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/device-analytics: apply
- 23:04 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/device-analytics: apply
- 23:04 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
- 23:03 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
- 23:03 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/data-gateway: apply
- 23:03 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/data-gateway: apply
- 23:02 eileen: config revision changed from 7d6ad875 to e478c565
- 22:59 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/commons-impact-analytics: apply
- 22:59 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/commons-impact-analytics: apply
- 22:58 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 22:58 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
- 22:57 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/chart-renderer: apply
- 22:56 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/chart-renderer: apply
- 22:55 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/apertium: apply
- 22:55 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/apertium: apply
- 22:53 jhathaway: upload new version of corto
- 22:48 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
- 22:47 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
- 22:47 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
- 22:46 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
- 22:46 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
- 22:45 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
- 22:45 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
- 22:45 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
- 22:43 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
- 22:43 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
- 22:41 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
- 22:31 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
- 22:30 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
- 22:30 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox: apply
- 22:17 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
- 22:17 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
- 22:16 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
- 22:16 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
- 22:15 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
- 22:15 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
- 22:14 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
- 22:14 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
- 22:14 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
- 22:13 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
- 22:12 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
- 22:12 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox: apply
- 21:56 egardner@deploy2002: Finished scap sync-world: Backport for Delay StickyHeaders section click instrumentation for slow loads (T412857) (duration: 07m 47s)
- 21:52 egardner@deploy2002: egardner: Continuing with sync
- 21:50 egardner@deploy2002: egardner: Backport for Delay StickyHeaders section click instrumentation for slow loads (T412857) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:48 egardner@deploy2002: Started scap sync-world: Backport for Delay StickyHeaders section click instrumentation for slow loads (T412857)
- 21:36 cscott@deploy2002: Finished scap sync-world: Backport for Enable post-processing cache for all Parsoid-rendered wikis (T348255), Decommission Article Summaries (T411558) (duration: 12m 13s)
- 21:32 cscott@deploy2002: ksarabia, ihurbain, cscott: Continuing with sync
- 21:26 cscott@deploy2002: ksarabia, ihurbain, cscott: Backport for Enable post-processing cache for all Parsoid-rendered wikis (T348255), Decommission Article Summaries (T411558) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:23 cscott@deploy2002: Started scap sync-world: Backport for Enable post-processing cache for all Parsoid-rendered wikis (T348255), Decommission Article Summaries (T411558)
- 21:18 cscott@deploy2002: Finished scap sync-world: Backport for ParserOutputAccess: don't use PoolCounter recursively (T412959), ParserOutputAccess: don't use PoolCounter recursively (T412959) (duration: 08m 50s)
- 21:15 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2239.codfw.wmnet with reason: Maintenance
- 21:15 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 (T410589)', diff saved to https://phabricator.wikimedia.org/P86728 and previous config saved to /var/cache/conftool/dbconfig/20251217-211537-ladsgroup.json
- 21:14 cscott@deploy2002: cscott: Continuing with sync
- 21:11 cscott@deploy2002: cscott: Backport for ParserOutputAccess: don't use PoolCounter recursively (T412959), ParserOutputAccess: don't use PoolCounter recursively (T412959) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:09 cscott@deploy2002: Started scap sync-world: Backport for ParserOutputAccess: don't use PoolCounter recursively (T412959), ParserOutputAccess: don't use PoolCounter recursively (T412959)
- 21:00 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P86727 and previous config saved to /var/cache/conftool/dbconfig/20251217-210029-ladsgroup.json
- 20:52 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
- 20:52 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-video: apply
- 20:51 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
- 20:51 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
- 20:51 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-videoscaler: apply
- 20:50 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mw-videoscaler: apply
- 20:50 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
- 20:50 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
- 20:49 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: apply
- 20:49 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
- 20:49 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: apply
- 20:49 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-media: apply
- 20:49 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
- 20:48 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
- 20:48 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox: apply
- 20:47 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox: apply
- 20:45 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P86726 and previous config saved to /var/cache/conftool/dbconfig/20251217-204520-ladsgroup.json
- 20:30 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 (T410589)', diff saved to https://phabricator.wikimedia.org/P86725 and previous config saved to /var/cache/conftool/dbconfig/20251217-203012-ladsgroup.json
- 19:11 dancy@deploy2002: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.7 refs T408277
- 18:55 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1236.eqiad.wmnet with reason: Maintenance
- 18:48 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
- 18:48 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-test: apply
- 18:47 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
- 18:46 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-test: apply
- 18:42 swfrench@deploy2002: Finished scap sync-world: Rebuild deployment to pick up new production image (duration: 78m 01s)
- 18:32 cmooney@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host es2028.codfw.wmnet with OS trixie
- 17:54 cmooney@cumin1003: START - Cookbook sre.hosts.reimage for host es2028.codfw.wmnet with OS trixie
- 17:51 topranks: upgrading OS on lswtest-d8-eqiad T412733
- 17:51 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ssw1-d[1,8]-eqiad with reason: upgrading sr-linux on lswtest-d8-eqiad
- 17:50 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ssw1-d[1,8]-eqiad.mgmt with reason: upgrading sr-linux on lswtest-d8-eqiad
- 17:46 cmooney@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host es2028.codfw.wmnet with OS trixie
- 17:34 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest1006.eqiad.wmnet with reason: upgrading connected switch
- 17:33 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lswtest-d8-eqiad,lswtest-d8-eqiad IPv6 with reason: upgrading sr-linux on lswtest-d8-eqiad
- 17:28 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host es2028
- 17:28 cmooney@cumin1003: START - Cookbook sre.hosts.move-vlan for host es2028
- 17:28 cmooney@cumin1003: START - Cookbook sre.hosts.reimage for host es2028.codfw.wmnet with OS trixie
- 17:27 swfrench@deploy2002: Started scap sync-world: Rebuild deployment to pick up new production image
- 17:24 cmooney@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es2028.codfw.wmnet with OS trixie
- 17:12 swfrench-wmf: reprepro include php8.3_8.3.28-1+wmf11u2 in component/php83
- 17:08 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp7009.*
- 17:04 fabfur: enabling puppet and repooling cp7009 (T412785)
- 16:38 eevans@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase1031.eqiad.wmnet
- 16:31 eevans@cumin1003: START - Cookbook sre.hosts.reboot-single for host restbase1031.eqiad.wmnet
- 15:50 eevans@cumin1003: conftool action : set/pooled=yes; selector: dc=eqiad,cluster=restbase,service=restbase-ssl
- 15:50 eevans@cumin1003: conftool action : set/pooled=yes; selector: dc=eqiad,cluster=restbase,service=restbase-https
- 15:49 eevans@cumin1003: conftool action : set/pooled=yes; selector: dc=eqiad,cluster=restbase,service=restbase-backend
- 15:45 eevans@cumin1003: conftool action : set/pooled=yes; selector: dc=eqiad,cluster=restbase,service=restbase-*
- 15:28 moritzm: upgrade Envoy on etherpad* T410975
- 15:12 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) restbase1031.eqiad.wmnet on all recursors
- 15:12 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache restbase1031.eqiad.wmnet on all recursors
- 15:12 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:12 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add AAAA to restbase1031 - ayounsi@cumin1003"
- 15:11 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add AAAA to restbase1031 - ayounsi@cumin1003"
- 15:11 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host es2028
- 15:11 cmooney@cumin1003: START - Cookbook sre.hosts.move-vlan for host es2028
- 15:11 cmooney@cumin1003: START - Cookbook sre.hosts.reimage for host es2028.codfw.wmnet with OS trixie
- 15:07 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
- 15:06 XioNoX: add AAAA record to restbase1031.eqiad.wmnet - T271140
- 15:05 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
- 15:05 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/api-gateway: apply
- 15:04 Lucas_WMDE: UTC afternoon backport+config window done
- 15:03 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
- 15:03 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
- 15:01 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
- 15:01 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/api-gateway: apply
- 14:59 moritzm: installing nodejs security updates
- 14:53 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 14:53 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 14:51 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Revert "Enable v2 non-emergency workflow by default" (T410512 T412715), Activate post-processing cache on some wikis (T348255) (duration: 18m 45s)
- 14:50 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 14:50 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 14:47 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, ihurbain: Continuing with sync
- 14:41 moritzm: installing tiff security updates
- 14:35 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, ihurbain: Backport for Revert "Enable v2 non-emergency workflow by default" (T410512 T412715), Activate post-processing cache on some wikis (T348255) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:33 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Revert "Enable v2 non-emergency workflow by default" (T410512 T412715), Activate post-processing cache on some wikis (T348255)
- 14:29 lucaswerkmeister-wmde@deploy2002: Sync cancelled.
- 14:22 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 14:22 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 14:19 lucaswerkmeister-wmde@deploy2002: stran, lucaswerkmeister-wmde: Backport for Enable v2 non-emergency workflow by default (T410512 T412715) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:18 moritzm: installing redis security updates
- 14:16 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Enable v2 non-emergency workflow by default (T410512 T412715)
- 14:11 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for lift throttle limits for Sing Lit 2025 (T412820) (duration: 07m 10s)
- 14:09 moritzm: installing pdns-recursor security updates
- 14:07 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, robertsky: Continuing with sync
- 14:06 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, robertsky: Backport for lift throttle limits for Sing Lit 2025 (T412820) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:04 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for lift throttle limits for Sing Lit 2025 (T412820)
- 13:53 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 13:52 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 13:47 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 13:45 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 13:44 moritzm: upgrade Envoy on grafana* T410975
- 13:36 moritzm: installing apache2 security updates
- 13:32 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 13:29 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 13:27 moritzm: upgrade Envoy on an-web T410975
- 12:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86724 and previous config saved to /var/cache/conftool/dbconfig/20251217-121556-marostegui.json
- 12:15 moritzm: installing pam security updates
- 12:07 mvolz@deploy2002: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
- 12:06 mvolz@deploy2002: helmfile [eqiad] START helmfile.d/services/citoid: apply
- 12:04 mvolz@deploy2002: helmfile [codfw] DONE helmfile.d/services/citoid: apply
- 12:04 mvolz@deploy2002: helmfile [codfw] START helmfile.d/services/citoid: apply
- 12:02 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/citoid: apply
- 12:01 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/citoid: apply
- 12:00 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P86723 and previous config saved to /var/cache/conftool/dbconfig/20251217-120047-marostegui.json
- 11:45 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P86722 and previous config saved to /var/cache/conftool/dbconfig/20251217-114539-marostegui.json
- 11:42 moritzm: installing libsndfile security updates
- 11:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86721 and previous config saved to /var/cache/conftool/dbconfig/20251217-113031-marostegui.json
- 11:28 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2222 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86720 and previous config saved to /var/cache/conftool/dbconfig/20251217-112818-marostegui.json
- 11:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2222.codfw.wmnet with reason: Maintenance
- 11:28 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86719 and previous config saved to /var/cache/conftool/dbconfig/20251217-112805-marostegui.json
- 11:23 Amir1: dropped "trash" and "percona" databases in x1
- 11:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P86718 and previous config saved to /var/cache/conftool/dbconfig/20251217-111257-marostegui.json
- 11:04 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wdqs1029.eqiad.wmnet with OS trixie
- 10:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P86717 and previous config saved to /var/cache/conftool/dbconfig/20251217-105748-marostegui.json
- 10:51 moritzm: installing libssh security updates
- 10:42 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86716 and previous config saved to /var/cache/conftool/dbconfig/20251217-104240-marostegui.json
- 10:37 jmm@deploy2002: helmfile [eqiad] DONE helmfile.d/services/kartotherian: apply
- 10:36 jmm@deploy2002: helmfile [eqiad] START helmfile.d/services/kartotherian: apply
- 10:35 jmm@deploy2002: helmfile [codfw] DONE helmfile.d/services/kartotherian: apply
- 10:34 jmm@deploy2002: helmfile [codfw] START helmfile.d/services/kartotherian: apply
- 10:34 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1029.eqiad.wmnet with reason: host reimage
- 10:33 jmm@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: apply
- 10:33 jmm@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: apply
- 10:26 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1029.eqiad.wmnet with reason: host reimage
- 10:08 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1029.eqiad.wmnet with OS trixie
- 10:07 kart_: Updated cxserver to 2025-12-15-140202-production
- 09:59 kartik@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
- 09:59 kartik@deploy2002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
- 09:55 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
- 09:54 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/cxserver: apply
- 09:32 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp7009.*
- 09:32 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp7009.*
- 09:28 fabfur: depool and disable puppet on cp7009 for haproxy qos testing (T412785)
- 09:18 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
- 09:18 kartik@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
- 09:14 jelto@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 09:13 moritzm: installing nginx security updates
- 09:13 jelto@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
- 09:12 jelto@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
- 09:09 jelto@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
- 09:07 jelto@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
- 09:06 jelto@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
- 09:05 jelto@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
- 09:04 jelto@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
- 08:41 elukey@deploy2002: Finished deploy [docker-pkg/deploy@4533f76]: Deploy docker-pkg (duration: 01m 08s)
- 08:40 elukey@deploy2002: Started deploy [docker-pkg/deploy@4533f76]: Deploy docker-pkg
- 08:26 moritzm: installing jq security updates
- 08:13 akosiaris@deploy2002: Finished scap sync-world: Backport for Update fc-list to point to fc-list Tool (T280718) (duration: 08m 22s)
- 08:08 akosiaris@deploy2002: akosiaris: Continuing with sync
- 08:07 akosiaris@deploy2002: akosiaris: Backport for Update fc-list to point to fc-list Tool (T280718) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 08:04 akosiaris@deploy2002: Started scap sync-world: Backport for Update fc-list to point to fc-list Tool (T280718)
- 06:07 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2221 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86714 and previous config saved to /var/cache/conftool/dbconfig/20251217-060706-marostegui.json
- 06:06 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2221.codfw.wmnet with reason: Maintenance
- 06:06 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2220 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86713 and previous config saved to /var/cache/conftool/dbconfig/20251217-060641-marostegui.json
- 05:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2220', diff saved to https://phabricator.wikimedia.org/P86712 and previous config saved to /var/cache/conftool/dbconfig/20251217-055133-marostegui.json
- 05:42 eileen: civicrm upgraded from a0d1f1f7 to 28ef5eb1
- 05:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2220', diff saved to https://phabricator.wikimedia.org/P86711 and previous config saved to /var/cache/conftool/dbconfig/20251217-053625-marostegui.json
- 05:30 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2222.codfw.wmnet with reason: schema change
- 05:21 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2220 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86710 and previous config saved to /var/cache/conftool/dbconfig/20251217-052117-marostegui.json
- 05:15 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
- 05:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86709 and previous config saved to /var/cache/conftool/dbconfig/20251217-051509-marostegui.json
- 05:00 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253', diff saved to https://phabricator.wikimedia.org/P86708 and previous config saved to /var/cache/conftool/dbconfig/20251217-050001-marostegui.json
- 04:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253', diff saved to https://phabricator.wikimedia.org/P86707 and previous config saved to /var/cache/conftool/dbconfig/20251217-044453-marostegui.json
- 04:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86706 and previous config saved to /var/cache/conftool/dbconfig/20251217-042943-marostegui.json
- 04:27 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1253 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86705 and previous config saved to /var/cache/conftool/dbconfig/20251217-042733-marostegui.json
- 04:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1253.eqiad.wmnet with reason: Maintenance
- 04:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86704 and previous config saved to /var/cache/conftool/dbconfig/20251217-042708-marostegui.json
- 04:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P86703 and previous config saved to /var/cache/conftool/dbconfig/20251217-041200-marostegui.json
- 03:56 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P86702 and previous config saved to /var/cache/conftool/dbconfig/20251217-035651-marostegui.json
- 03:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86701 and previous config saved to /var/cache/conftool/dbconfig/20251217-034143-marostegui.json
- 02:59 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2227 (T410589)', diff saved to https://phabricator.wikimedia.org/P86700 and previous config saved to /var/cache/conftool/dbconfig/20251217-025900-ladsgroup.json
- 02:58 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2227.codfw.wmnet with reason: Maintenance
- 02:58 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 (T410589)', diff saved to https://phabricator.wikimedia.org/P86699 and previous config saved to /var/cache/conftool/dbconfig/20251217-025835-ladsgroup.json
- 02:43 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P86698 and previous config saved to /var/cache/conftool/dbconfig/20251217-024326-ladsgroup.json
- 02:41 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1231 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86697 and previous config saved to /var/cache/conftool/dbconfig/20251217-024127-marostegui.json
- 02:41 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1231.eqiad.wmnet with reason: Maintenance
- 02:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86696 and previous config saved to /var/cache/conftool/dbconfig/20251217-024103-marostegui.json
- 02:28 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P86695 and previous config saved to /var/cache/conftool/dbconfig/20251217-022818-ladsgroup.json
- 02:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P86694 and previous config saved to /var/cache/conftool/dbconfig/20251217-022554-marostegui.json
- 02:13 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 (T410589)', diff saved to https://phabricator.wikimedia.org/P86693 and previous config saved to /var/cache/conftool/dbconfig/20251217-021310-ladsgroup.json
- 02:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P86692 and previous config saved to /var/cache/conftool/dbconfig/20251217-021046-marostegui.json
- 01:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86691 and previous config saved to /var/cache/conftool/dbconfig/20251217-015538-marostegui.json
- 01:25 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 24m 10s)
- 01:01 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
- 00:58 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/zotero: apply
- 00:58 rzl@deploy2002: helmfile [staging] START helmfile.d/services/zotero: apply
- 00:58 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 00:57 rzl@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 00:57 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
- 00:57 rzl@deploy2002: helmfile [staging] START helmfile.d/services/wikifeeds: apply
- 00:56 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/wikidata-query-gui: apply
- 00:56 rzl@deploy2002: helmfile [staging] START helmfile.d/services/wikidata-query-gui: apply
- 00:56 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/toolhub: apply
- 00:56 rzl@deploy2002: helmfile [staging] START helmfile.d/services/toolhub: apply
- 00:50 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/termbox: apply
- 00:50 rzl@deploy2002: helmfile [staging] START helmfile.d/services/termbox: apply
- 00:50 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: apply
- 00:49 rzl@deploy2002: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: apply
- 00:49 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
- 00:49 rzl@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-video: apply
- 00:49 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
- 00:48 rzl@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
- 00:48 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
- 00:48 rzl@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
- 00:47 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2220 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86690 and previous config saved to /var/cache/conftool/dbconfig/20251217-004659-marostegui.json
- 00:46 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
- 00:46 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2220.codfw.wmnet with reason: Maintenance
- 00:46 rzl@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-media: apply
- 00:46 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
- 00:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86689 and previous config saved to /var/cache/conftool/dbconfig/20251217-004634-marostegui.json
- 00:46 rzl@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
- 00:46 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox: apply
- 00:45 rzl@deploy2002: helmfile [staging] START helmfile.d/services/shellbox: apply
- 00:45 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/sessionstore: apply
- 00:45 rzl@deploy2002: helmfile [staging] START helmfile.d/services/sessionstore: apply
- 00:43 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/recommendation-api: apply
- 00:43 rzl@deploy2002: helmfile [staging] START helmfile.d/services/recommendation-api: apply
- 00:43 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
- 00:43 rzl@deploy2002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
- 00:43 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/push-notifications: apply
- 00:43 rzl@deploy2002: helmfile [staging] START helmfile.d/services/push-notifications: apply
- 00:43 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/proton: apply
- 00:42 rzl@deploy2002: helmfile [staging] START helmfile.d/services/proton: apply
- 00:42 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/page-analytics: apply
- 00:42 rzl@deploy2002: helmfile [staging] START helmfile.d/services/page-analytics: apply
- 00:41 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply
- 00:41 rzl@deploy2002: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
- 00:39 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
- 00:39 rzl@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
- 00:39 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
- 00:38 rzl@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
- 00:38 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
- 00:37 rzl@deploy2002: helmfile [staging] START helmfile.d/services/media-analytics: apply
- 00:37 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/machinetranslation: apply
- 00:34 rzl@deploy2002: helmfile [staging] START helmfile.d/services/machinetranslation: apply
- 00:33 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply
- 00:32 rzl@deploy2002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
- 00:31 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: apply
- 00:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P86688 and previous config saved to /var/cache/conftool/dbconfig/20251217-003126-marostegui.json
- 00:30 rzl@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: apply
- 00:30 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
- 00:30 rzl@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
- 00:29 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/image-suggestion: apply
- 00:29 rzl@deploy2002: helmfile [staging] START helmfile.d/services/image-suggestion: apply
- 00:29 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/geo-analytics: apply
- 00:28 rzl@deploy2002: helmfile [staging] START helmfile.d/services/geo-analytics: apply
- 00:28 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/eventstreams-internal: apply
- 00:28 rzl@deploy2002: helmfile [staging] START helmfile.d/services/eventstreams-internal: apply
- 00:28 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
- 00:27 rzl@deploy2002: helmfile [staging] START helmfile.d/services/eventstreams: apply
- 00:27 eileen: civicrm upgraded from 000ff848 to a0d1f1f7
- 00:27 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-main: apply
- 00:27 rzl@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-main: apply
- 00:27 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply
- 00:27 rzl@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply
- 00:26 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-analytics-external: apply
- 00:26 rzl@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-analytics-external: apply
- 00:26 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-analytics: apply
- 00:26 rzl@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-analytics: apply
- 00:25 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
- 00:25 rzl@deploy2002: helmfile [staging] START helmfile.d/services/editor-analytics: apply
- 00:25 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
- 00:25 rzl@deploy2002: helmfile [staging] START helmfile.d/services/edit-analytics: apply
- 00:24 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/echostore: apply
- 00:24 rzl@deploy2002: helmfile [staging] START helmfile.d/services/echostore: apply
- 00:24 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
- 00:24 rzl@deploy2002: helmfile [staging] START helmfile.d/services/device-analytics: apply
- 00:23 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
- 00:23 rzl@deploy2002: helmfile [staging] START helmfile.d/services/developer-portal: apply
- 00:23 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/data-gateway: apply
- 00:23 rzl@deploy2002: helmfile [staging] START helmfile.d/services/data-gateway: apply
- 00:22 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
- 00:22 rzl@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
- 00:22 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/commons-impact-analytics: apply
- 00:22 rzl@deploy2002: helmfile [staging] START helmfile.d/services/commons-impact-analytics: apply
- 00:21 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/citoid: apply
- 00:20 rzl@deploy2002: helmfile [staging] START helmfile.d/services/citoid: apply
- 00:20 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
- 00:20 rzl@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
- 00:20 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/chart-renderer: apply
- 00:20 rzl@deploy2002: helmfile [staging] START helmfile.d/services/chart-renderer: apply
- 00:18 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/apertium: apply
- 00:17 rzl@deploy2002: helmfile [staging] START helmfile.d/services/apertium: apply
- 00:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P86687 and previous config saved to /var/cache/conftool/dbconfig/20251217-001617-marostegui.json
- 00:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86686 and previous config saved to /var/cache/conftool/dbconfig/20251217-000109-marostegui.json
2025-12-16
- 23:40 egardner@deploy2002: Finished scap sync-world: Backport for [Moderator tools] Add data-mw-interface in addition to data-mw="interface" (T409187), Delay StickyHeaders section click instrumentation for slow loads (T412857) (duration: 11m 47s)
- 23:34 egardner@deploy2002: jsn, egardner: Continuing with sync
- 23:32 egardner@deploy2002: jsn, egardner: Backport for [Moderator tools] Add data-mw-interface in addition to data-mw="interface" (T409187), Delay StickyHeaders section click instrumentation for slow loads (T412857) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 23:28 egardner@deploy2002: Started scap sync-world: Backport for [Moderator tools] Add data-mw-interface in addition to data-mw="interface" (T409187), Delay StickyHeaders section click instrumentation for slow loads (T412857)
- 23:04 jsn@deploy2002: Finished scap sync-world: Backport for product_metrics.special_create_account: Collect mediawiki_database (T412866) (duration: 50m 45s)
- 22:51 jsn@deploy2002: kharlan, jsn: Continuing with sync
- 22:50 jsn@deploy2002: kharlan, jsn: Backport for product_metrics.special_create_account: Collect mediawiki_database (T412866) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 22:41 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
- 22:40 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-test: apply
- 22:13 jsn@deploy2002: Started scap sync-world: Backport for product_metrics.special_create_account: Collect mediawiki_database (T412866)
- 22:02 jsn@deploy2002: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.5,1.46.0-wmf.7,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/mediawi
- 21:58 Amir1: mwscript-k8s --follow -- findBadBlobs.php --wiki elwiki --mark "Corrupted UTF-8 (T351953)" --revisions 26381,30551 (T351953)
- 21:17 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1227 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86684 and previous config saved to /var/cache/conftool/dbconfig/20251216-211743-marostegui.json
- 21:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1227.eqiad.wmnet with reason: Maintenance
- 21:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86683 and previous config saved to /var/cache/conftool/dbconfig/20251216-211718-marostegui.json
- 21:08 jsn@deploy2002: Started scap sync-world: Backport for product_metrics.special_create_account: Collect mediawiki_database (T412866)
- 21:06 eileen: civicrm upgraded from 03479639 to 000ff848
- 21:02 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P86682 and previous config saved to /var/cache/conftool/dbconfig/20251216-210210-marostegui.json
- 20:47 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P86681 and previous config saved to /var/cache/conftool/dbconfig/20251216-204701-marostegui.json
- 20:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86680 and previous config saved to /var/cache/conftool/dbconfig/20251216-203153-marostegui.json
- 20:17 dzahn@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
- 20:16 dzahn@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
- 19:45 cstone: SmashPig upgraded from 5c731f99 to 631fff60
- 19:26 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2208 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86678 and previous config saved to /var/cache/conftool/dbconfig/20251216-192603-marostegui.json
- 19:25 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2208.codfw.wmnet with reason: Maintenance
- 19:25 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2200.codfw.wmnet with reason: Maintenance
- 19:23 ryankemper@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on an-worker1148.eqiad.wmnet with reason: T411919
- 19:18 dancy@deploy2002: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.7 refs T408277
- 19:18 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1202 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86676 and previous config saved to /var/cache/conftool/dbconfig/20251216-191759-marostegui.json
- 19:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1202.eqiad.wmnet with reason: Maintenance
- 19:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86675 and previous config saved to /var/cache/conftool/dbconfig/20251216-191733-marostegui.json
- 19:02 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P86674 and previous config saved to /var/cache/conftool/dbconfig/20251216-190225-marostegui.json
- 18:47 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P86673 and previous config saved to /var/cache/conftool/dbconfig/20251216-184717-marostegui.json
- 18:38 cmooney@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es2028.codfw.wmnet with OS trixie
- 18:32 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86672 and previous config saved to /var/cache/conftool/dbconfig/20251216-183208-marostegui.json
- 17:59 tappof: Cleaned up old files (not deleted by logrotate) on centrallog1002; removed the rsyslog-debug file on centrallog1002.
- 17:18 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1194 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86671 and previous config saved to /var/cache/conftool/dbconfig/20251216-171841-marostegui.json
- 17:18 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1194.eqiad.wmnet with reason: Maintenance
- 17:18 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host es2028
- 17:18 cmooney@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host es2028
- 17:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86670 and previous config saved to /var/cache/conftool/dbconfig/20251216-171816-marostegui.json
- 17:18 cmooney@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host es2028
- 17:18 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) es2028.codfw.wmnet 140.0.192.10.in-addr.arpa 0.4.1.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 17:18 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache es2028.codfw.wmnet 140.0.192.10.in-addr.arpa 0.4.1.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 17:18 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 17:18 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host es2028 - cmooney@cumin1003"
- 17:18 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host es2028 - cmooney@cumin1003"
- 17:14 cmooney@cumin1003: START - Cookbook sre.dns.netbox
- 17:14 cmooney@cumin1003: START - Cookbook sre.hosts.move-vlan for host es2028
- 17:14 cmooney@cumin1003: START - Cookbook sre.hosts.reimage for host es2028.codfw.wmnet with OS trixie
- 17:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P86669 and previous config saved to /var/cache/conftool/dbconfig/20251216-170308-marostegui.json
- 17:01 derick@deploy2002: mwscript-k8s job started: extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=enwikibooks --logwiki=metawiki Magiuser 'Renamed user f3a49d320a6984a0d6b403d313476916' # T412784
- 16:54 stran@deploy2002: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
- 16:54 stran@deploy2002: helmfile [codfw] START helmfile.d/services/ipoid: apply
- 16:54 stran@deploy2002: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
- 16:53 stran@deploy2002: helmfile [eqiad] START helmfile.d/services/ipoid: apply
- 16:52 stran@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
- 16:52 stran@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
- 16:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P86668 and previous config saved to /var/cache/conftool/dbconfig/20251216-164800-marostegui.json
- 16:47 moritzm: installing unbound security updates
- 16:32 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86667 and previous config saved to /var/cache/conftool/dbconfig/20251216-163252-marostegui.json
- 16:18 brett@dns1006: END - running authdns-update
- 16:15 brett@dns1006: START - running authdns-update
- 16:04 brennen@deploy2002: Finished deploy [phabricator/deployment@3a23687]: deploy phab1004 for T412825 (duration: 00m 58s)
- 16:03 brennen@deploy2002: Started deploy [phabricator/deployment@3a23687]: deploy phab1004 for T412825
- 16:03 brennen@deploy2002: Finished deploy [phabricator/deployment@3a23687]: deploy phab2002 for T412825 (duration: 00m 31s)
- 16:03 brennen@deploy2002: Started deploy [phabricator/deployment@3a23687]: deploy phab2002 for T412825
- 16:02 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: Phabricator deploy
- 16:01 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phabricator deploy
- 15:47 hashar: Restarting CI Jenkins
- 15:46 jmm@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
- 15:45 jmm@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
- 15:30 kharlan@deploy2002: Finished scap sync-world: Backport for hCaptcha: end frwiki A/B test (T405239) (duration: 13m 26s)
- 15:26 gehel: cleanup temp files on archiva1002
- 15:26 kharlan@deploy2002: kharlan: Continuing with sync
- 15:25 ejegg: payments-wiki upgraded from 8db01377 to 8a207d81
- 15:18 kharlan@deploy2002: kharlan: Backport for hCaptcha: end frwiki A/B test (T405239) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 15:18 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1191 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86665 and previous config saved to /var/cache/conftool/dbconfig/20251216-151834-marostegui.json
- 15:18 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1191.eqiad.wmnet with reason: Maintenance
- 15:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86664 and previous config saved to /var/cache/conftool/dbconfig/20251216-151809-marostegui.json
- 15:16 kharlan@deploy2002: Started scap sync-world: Backport for hCaptcha: end frwiki A/B test (T405239)
- 15:06 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
- 15:06 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
- 15:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P86663 and previous config saved to /var/cache/conftool/dbconfig/20251216-150301-marostegui.json
- 14:57 Dreamy_Jazz: Afternoon UTC backport window done
- 14:56 dreamyjazz@deploy2002: Finished scap sync-world: Backport for Pin $wgCheckUserUserAgentTableMigrationStage as SCHEMA_COMPAT_OLD (T361173) (duration: 06m 55s)
- 14:52 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
- 14:52 dreamyjazz@deploy2002: dreamyjazz: Backport for Pin $wgCheckUserUserAgentTableMigrationStage as SCHEMA_COMPAT_OLD (T361173) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:49 dreamyjazz@deploy2002: Started scap sync-world: Backport for Pin $wgCheckUserUserAgentTableMigrationStage as SCHEMA_COMPAT_OLD (T361173)
- 14:47 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P86662 and previous config saved to /var/cache/conftool/dbconfig/20251216-144752-marostegui.json
- 14:47 sbisson@deploy2002: Finished scap sync-world: Backport for CX3 Build 1.0.0+20251215 (T408842 T411779) (duration: 07m 27s)
- 14:45 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2198.codfw.wmnet with reason: Maintenance
- 14:45 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86661 and previous config saved to /var/cache/conftool/dbconfig/20251216-144533-marostegui.json
- 14:43 sbisson@deploy2002: sbisson: Continuing with sync
- 14:42 sbisson@deploy2002: sbisson: Backport for CX3 Build 1.0.0+20251215 (T408842 T411779) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:39 sbisson@deploy2002: Started scap sync-world: Backport for CX3 Build 1.0.0+20251215 (T408842 T411779)
- 14:37 sbisson@deploy2002: Finished scap sync-world: Backport for svwiki: lift autoconfirmed setting (T412713) (duration: 09m 49s)
- 14:33 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wdqs1029.eqiad.wmnet with OS trixie
- 14:33 sbisson@deploy2002: sbisson, hamishz: Continuing with sync
- 14:32 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86660 and previous config saved to /var/cache/conftool/dbconfig/20251216-143244-marostegui.json
- 14:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P86659 and previous config saved to /var/cache/conftool/dbconfig/20251216-143025-marostegui.json
- 14:29 sbisson@deploy2002: sbisson, hamishz: Backport for svwiki: lift autoconfirmed setting (T412713) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:29 moritzm: installing glibc security updates
- 14:27 sbisson@deploy2002: Started scap sync-world: Backport for svwiki: lift autoconfirmed setting (T412713)
- 14:26 jmm@deploy2002: helmfile [eqiad] DONE helmfile.d/services/proton: apply
- 14:25 sbisson@deploy2002: Finished scap sync-world: Backport for zhwiki: enable protection indicators (T412710) (duration: 08m 05s)
- 14:24 jmm@deploy2002: helmfile [eqiad] START helmfile.d/services/proton: apply
- 14:21 sbisson@deploy2002: sbisson, hamishz: Continuing with sync
- 14:19 sbisson@deploy2002: sbisson, hamishz: Backport for zhwiki: enable protection indicators (T412710) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:17 sbisson@deploy2002: Started scap sync-world: Backport for zhwiki: enable protection indicators (T412710)
- 14:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P86658 and previous config saved to /var/cache/conftool/dbconfig/20251216-141517-marostegui.json
- 14:13 jmm@deploy2002: helmfile [codfw] DONE helmfile.d/services/proton: apply
- 14:12 sbisson@deploy2002: Finished scap sync-world: Backport for core-Permission: Add abusefilter-access-protected-vars to temporary-account-viewer in jawiki (T412791) (duration: 07m 15s)
- 14:11 jmm@deploy2002: helmfile [codfw] START helmfile.d/services/proton: apply
- 14:10 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest2003.codfw.wmnet
- 14:10 jmm@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['es2028.codfw.wmnet']
- 14:09 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1029.eqiad.wmnet with reason: host reimage
- 14:08 sbisson@deploy2002: bunnypranav, sbisson: Continuing with sync
- 14:08 jmm@deploy2002: helmfile [staging] DONE helmfile.d/services/proton: apply
- 14:07 ayounsi@cumin1003: START - Cookbook sre.hosts.reboot-single for host sretest2003.codfw.wmnet
- 14:07 sbisson@deploy2002: bunnypranav, sbisson: Backport for core-Permission: Add abusefilter-access-protected-vars to temporary-account-viewer in jawiki (T412791) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:06 jmm@deploy2002: helmfile [staging] START helmfile.d/services/proton: apply
- 14:05 sbisson@deploy2002: Started scap sync-world: Backport for core-Permission: Add abusefilter-access-protected-vars to temporary-account-viewer in jawiki (T412791)
- 14:05 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest2003.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
- 14:04 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['es2028.codfw.wmnet']
- 14:04 ayounsi@cumin1003: START - Cookbook sre.hosts.provision for host sretest2003.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
- 14:04 jmm@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['es2028.codfw.wmnet']
- 14:02 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1029.eqiad.wmnet with reason: host reimage
- 14:00 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86657 and previous config saved to /var/cache/conftool/dbconfig/20251216-140008-marostegui.json
- 13:56 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['es2028.codfw.wmnet']
- 13:55 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es2028.codfw.wmnet with OS trixie
- 13:50 jmm@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['es2028.codfw.wmnet']
- 13:44 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1029.eqiad.wmnet with OS trixie
- 13:43 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['es2028.codfw.wmnet']
- 13:43 jmm@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['es2028.codfw.wmnet']
- 13:36 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['es2028.codfw.wmnet']
- 13:35 jmm@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['es2028.codfw.wmnet']
- 13:30 Emperor: enable puppet on O:swift::proxy
- 13:29 Emperor: repool ms-fe1010
- 13:24 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['es2028.codfw.wmnet']
- 13:14 Emperor: depool ms-fe1010 for testing
- 13:06 Emperor: disable puppet on O:swift::proxy
- 13:01 godog: fix network configuration and reboot cloudcephosd1052 - T399180
- 12:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts puppetmaster1003.eqiad.wmnet
- 12:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 12:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: puppetmaster1003.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 12:52 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: puppetmaster1003.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 12:51 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-main: apply
- 12:51 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-main: apply
- 12:45 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 12:38 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts puppetmaster1003.eqiad.wmnet
- 12:38 dreamyjazz@deploy2002: Finished scap sync-world: Backport for Follow-up: SI: Add "past checks" link next to accounts in table pager (T411268) (duration: 10m 47s)
- 12:35 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2028.codfw.wmnet with OS trixie
- 12:34 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host es2028.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
- 12:32 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
- 12:31 dreamyjazz@deploy2002: dreamyjazz: Backport for Follow-up: SI: Add "past checks" link next to accounts in table pager (T411268) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 12:28 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
- 12:28 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
- 12:27 dreamyjazz@deploy2002: Started scap sync-world: Backport for Follow-up: SI: Add "past checks" link next to accounts in table pager (T411268)
- 12:27 marostegui@cumin1003: START - Cookbook sre.hosts.provision for host es2028.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
- 12:15 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host es2028.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
- 12:12 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
- 12:12 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
- 12:08 dreamyjazz@deploy2002: Finished scap sync-world: Backport for Remove definition of wgGlobalBlockingEnableAutoblocks (T379086), Show global autoblocks in the globalblocks list API response (T379087) (duration: 67m 55s)
- 11:54 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
- 11:50 dreamyjazz@deploy2002: dreamyjazz: Backport for Remove definition of wgGlobalBlockingEnableAutoblocks (T379086), Show global autoblocks in the globalblocks list API response (T379087) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 11:40 marostegui@cumin1003: START - Cookbook sre.hosts.provision for host es2028.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
- 11:39 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wdqs1029.eqiad.wmnet with OS trixie
- 11:22 marostegui@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host es2028.codfw.wmnet with OS trixie
- 11:19 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1029.eqiad.wmnet with reason: host reimage
- 11:15 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1029.eqiad.wmnet with reason: host reimage
- 11:04 urbanecm@deploy2002: mwscript-k8s job started: GrowthExperiments:fixLinkRecommendationData --wiki=itwiki --dry-run --search-index --db-table # T412040-fix-dryrun-02
- 11:00 dreamyjazz@deploy2002: Started scap sync-world: Backport for Remove definition of wgGlobalBlockingEnableAutoblocks (T379086), Show global autoblocks in the globalblocks list API response (T379087)
- 10:58 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1029.eqiad.wmnet with OS trixie
- 10:58 mwpresync@deploy2002: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.5,1.46.0-wmf.7,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/m
- 10:57 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wdqs1029.eqiad.wmnet with OS trixie
- 10:46 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2028.codfw.wmnet with OS trixie
- 10:45 marostegui@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host es2028.codfw.wmnet with OS trixie
- 10:44 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1029.eqiad.wmnet with OS trixie
- 10:34 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2028.codfw.wmnet with OS trixie
- 10:32 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2028.codfw.wmnet with reason: reimage
- 10:05 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wdqs1029.eqiad.wmnet with OS trixie
- 10:05 mwpresync@deploy2002: Started scap sync-world: testwikis to 1.46.0-wmf.7 refs T408277
- 10:04 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab1003.wikimedia.org
- 10:03 hashar: Started MediaWiki train task `train-presync`. It did not run overnight due to a CI failure | T408277
- 09:58 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab1003.wikimedia.org
- 09:54 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab2002.wikimedia.org
- 09:46 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab2002.wikimedia.org
- 09:43 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1029.eqiad.wmnet with reason: host reimage
- 09:40 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1029.eqiad.wmnet with reason: host reimage
- 09:37 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2209 (T410589)', diff saved to https://phabricator.wikimedia.org/P86654 and previous config saved to /var/cache/conftool/dbconfig/20251216-093745-ladsgroup.json
- 09:37 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2209.codfw.wmnet with reason: Maintenance
- 09:37 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 (T410589)', diff saved to https://phabricator.wikimedia.org/P86653 and previous config saved to /var/cache/conftool/dbconfig/20251216-093720-ladsgroup.json
- 09:22 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P86652 and previous config saved to /var/cache/conftool/dbconfig/20251216-092212-ladsgroup.json
- 09:22 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1029.eqiad.wmnet with OS trixie
- 09:21 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "add primary IP to ps1-e10-eqiad - ayounsi@cumin1003"
- 09:20 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "add primary IP to ps1-e10-eqiad - ayounsi@cumin1003"
- 09:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts puppetmaster2002.codfw.wmnet
- 09:12 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 09:12 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: puppetmaster2002.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 09:07 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P86651 and previous config saved to /var/cache/conftool/dbconfig/20251216-090704-ladsgroup.json
- 09:04 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: puppetmaster2002.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 09:01 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 08:55 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts puppetmaster2002.codfw.wmnet
- 08:51 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 (T410589)', diff saved to https://phabricator.wikimedia.org/P86650 and previous config saved to /var/cache/conftool/dbconfig/20251216-085155-ladsgroup.json
- 08:48 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2182 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86649 and previous config saved to /var/cache/conftool/dbconfig/20251216-084817-marostegui.json
- 08:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2182.codfw.wmnet with reason: Maintenance
- 08:47 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86648 and previous config saved to /var/cache/conftool/dbconfig/20251216-084752-marostegui.json
- 08:32 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P86647 and previous config saved to /var/cache/conftool/dbconfig/20251216-083243-marostegui.json
- 08:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P86646 and previous config saved to /var/cache/conftool/dbconfig/20251216-081735-marostegui.json
- 08:11 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wdqs1029.eqiad.wmnet with OS bookworm
- 08:02 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86645 and previous config saved to /var/cache/conftool/dbconfig/20251216-080227-marostegui.json
- 07:37 ayounsi@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
- 07:36 ayounsi@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
- 07:21 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1181 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86644 and previous config saved to /var/cache/conftool/dbconfig/20251216-072114-marostegui.json
- 07:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1181.eqiad.wmnet with reason: Maintenance
- 07:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86643 and previous config saved to /var/cache/conftool/dbconfig/20251216-072049-marostegui.json
- 07:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P86642 and previous config saved to /var/cache/conftool/dbconfig/20251216-070542-marostegui.json
- 06:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P86641 and previous config saved to /var/cache/conftool/dbconfig/20251216-065033-marostegui.json
- 06:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86640 and previous config saved to /var/cache/conftool/dbconfig/20251216-063525-marostegui.json
- 05:16 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1174 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86639 and previous config saved to /var/cache/conftool/dbconfig/20251216-051607-marostegui.json
- 05:15 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1174.eqiad.wmnet with reason: Maintenance
- 02:52 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2168 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86638 and previous config saved to /var/cache/conftool/dbconfig/20251216-025200-marostegui.json
- 02:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2168.codfw.wmnet with reason: Maintenance
- 02:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86637 and previous config saved to /var/cache/conftool/dbconfig/20251216-025136-marostegui.json
- 02:50 ladsgroup@deploy2002: Finished scap sync-world: Backport for SpecialLinkSearch: Add a message when domains are being ignored (T405005) (duration: 38m 47s)
- 02:37 ladsgroup@deploy2002: ladsgroup: Continuing with sync
- 02:36 ladsgroup@deploy2002: ladsgroup: Backport for SpecialLinkSearch: Add a message when domains are being ignored (T405005) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 02:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P86636 and previous config saved to /var/cache/conftool/dbconfig/20251216-023627-marostegui.json
- 02:21 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P86635 and previous config saved to /var/cache/conftool/dbconfig/20251216-022119-marostegui.json
- 02:11 ladsgroup@deploy2002: Started scap sync-world: Backport for SpecialLinkSearch: Add a message when domains are being ignored (T405005)
- 02:06 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86634 and previous config saved to /var/cache/conftool/dbconfig/20251216-020611-marostegui.json
- 01:01 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 01m 15s)
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
- 00:53 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 00:52 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 00:50 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 00:50 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 00:49 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 00:48 rzl@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 00:46 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
- 00:45 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/api-gateway: apply
- 00:44 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
- 00:43 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
- 00:42 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
- 00:41 rzl@deploy2002: helmfile [staging] START helmfile.d/services/api-gateway: apply
2025-12-15
- 23:46 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
- 23:46 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
- 23:37 kemayo@deploy2002: Finished scap sync-world: Backport for mobileSectionSwitch: make sure session ID isn't regenerated each time (T410803) (duration: 07m 17s)
- 23:33 kemayo@deploy2002: kemayo: Continuing with sync
- 23:32 kemayo@deploy2002: kemayo: Backport for mobileSectionSwitch: make sure session ID isn't regenerated each time (T410803) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 23:30 kemayo@deploy2002: Started scap sync-world: Backport for mobileSectionSwitch: make sure session ID isn't regenerated each time (T410803)
- 23:21 musikanimal@deploy2002: Finished scap sync-world: Backport for CodeMirrorWikiEditor: add style module to prevent FOUCs (duration: 07m 03s)
- 23:17 musikanimal@deploy2002: musikanimal: Continuing with sync
- 23:16 musikanimal@deploy2002: musikanimal: Backport for CodeMirrorWikiEditor: add style module to prevent FOUCs synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 23:14 musikanimal@deploy2002: Started scap sync-world: Backport for CodeMirrorWikiEditor: add style module to prevent FOUCs
- 23:03 musikanimal@deploy2002: Finished scap sync-world: Backport for Use CodeMirror instead of CodeEditor for beta feature users (T373711) (duration: 12m 17s)
- 22:59 musikanimal@deploy2002: musikanimal: Continuing with sync
- 22:56 musikanimal@deploy2002: musikanimal: Backport for Use CodeMirror instead of CodeEditor for beta feature users (T373711) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 22:53 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1171.eqiad.wmnet with reason: Maintenance
- 22:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86633 and previous config saved to /var/cache/conftool/dbconfig/20251215-225317-marostegui.json
- 22:51 musikanimal@deploy2002: Started scap sync-world: Backport for Use CodeMirror instead of CodeEditor for beta feature users (T373711)
- 22:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P86632 and previous config saved to /var/cache/conftool/dbconfig/20251215-223808-marostegui.json
- 22:23 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on releases2003.codfw.wmnet with reason: T289858
- 22:23 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on releases1003.eqiad.wmnet with reason: T289858
- 22:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P86631 and previous config saved to /var/cache/conftool/dbconfig/20251215-222300-marostegui.json
- 22:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86630 and previous config saved to /var/cache/conftool/dbconfig/20251215-220751-marostegui.json
- 21:44 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on releases1003.eqiad.wmnet with reason: T289858
- 21:44 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on releases2003.codfw.wmnet with reason: T289858
- 20:56 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2159 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86629 and previous config saved to /var/cache/conftool/dbconfig/20251215-205643-marostegui.json
- 20:56 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2159.codfw.wmnet with reason: Maintenance
- 20:56 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86628 and previous config saved to /var/cache/conftool/dbconfig/20251215-205616-marostegui.json
- 20:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P86627 and previous config saved to /var/cache/conftool/dbconfig/20251215-204108-marostegui.json
- 20:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P86626 and previous config saved to /var/cache/conftool/dbconfig/20251215-202559-marostegui.json
- 20:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86625 and previous config saved to /var/cache/conftool/dbconfig/20251215-201051-marostegui.json
- 20:04 ejegg: payments-wiki upgraded from fc18b3c0 to 8db01377
- 18:13 rzl@deploy2002: Finished scap sync-world: https://gerrit.wikimedia.org/r/1217610 T410975 (duration: 05m 20s)
- 18:12 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on releases2003.codfw.wmnet with reason: T289858
- 18:11 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on releases1003.eqiad.wmnet with reason: T289858
- 18:08 rzl@deploy2002: Started scap sync-world: https://gerrit.wikimedia.org/r/1217610 T410975
- 17:27 kemayo@deploy2002: Finished scap sync-world: Backport for mobileSectionSwitch: make sure the config gets adjusted earlier (T410803) (duration: 06m 46s)
- 17:27 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
- 17:27 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
- 17:23 kemayo@deploy2002: kemayo: Continuing with sync
- 17:22 kemayo@deploy2002: kemayo: Backport for mobileSectionSwitch: make sure the config gets adjusted earlier (T410803) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 17:21 kemayo@deploy2002: Started scap sync-world: Backport for mobileSectionSwitch: make sure the config gets adjusted earlier (T410803)
- 17:08 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1029.eqiad.wmnet with reason: host reimage
- 17:04 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1029.eqiad.wmnet with reason: host reimage
- 17:00 urbanecm@deploy2002: mwscript-k8s job started: GrowthExperiments:fixLinkRecommendationData --wiki=itwiki --dry-run --search-index --db-table # T412040-fix-dryrun
- 16:59 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
- 16:59 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
- 16:59 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1170 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86623 and previous config saved to /var/cache/conftool/dbconfig/20251215-165926-marostegui.json
- 16:59 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1170.eqiad.wmnet with reason: Maintenance
- 16:59 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86622 and previous config saved to /var/cache/conftool/dbconfig/20251215-165901-marostegui.json
- 16:44 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1029.eqiad.wmnet with OS bookworm
- 16:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P86621 and previous config saved to /var/cache/conftool/dbconfig/20251215-164353-marostegui.json
- 16:28 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P86620 and previous config saved to /var/cache/conftool/dbconfig/20251215-162844-marostegui.json
- 16:23 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
- 16:23 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
- 16:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86619 and previous config saved to /var/cache/conftool/dbconfig/20251215-161335-marostegui.json
- 15:56 hashar@deploy2002: Finished deploy [releng/jenkins-deploy@863e5c2] (releasing): Update git-client plugin # T412694 (duration: 01m 19s)
- 15:55 hashar@deploy2002: Started deploy [releng/jenkins-deploy@863e5c2] (releasing): Update git-client plugin # T412694
- 15:44 ladsgroup@deploy2002: Finished scap sync-world: Backport for Set wgExternalLinksIgnoreDomains in production (T405005) (duration: 08m 20s)
- 15:40 ladsgroup@deploy2002: ladsgroup: Continuing with sync
- 15:40 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2194 (T410589)', diff saved to https://phabricator.wikimedia.org/P86618 and previous config saved to /var/cache/conftool/dbconfig/20251215-153957-ladsgroup.json
- 15:39 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2194.codfw.wmnet with reason: Maintenance
- 15:39 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 (T410589)', diff saved to https://phabricator.wikimedia.org/P86617 and previous config saved to /var/cache/conftool/dbconfig/20251215-153933-ladsgroup.json
- 15:39 hashar: Upgrading CI Jenkins
- 15:38 ladsgroup@deploy2002: ladsgroup: Backport for Set wgExternalLinksIgnoreDomains in production (T405005) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 15:36 ladsgroup@deploy2002: Started scap sync-world: Backport for Set wgExternalLinksIgnoreDomains in production (T405005)
- 15:24 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P86616 and previous config saved to /var/cache/conftool/dbconfig/20251215-152425-ladsgroup.json
- 15:09 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P86615 and previous config saved to /var/cache/conftool/dbconfig/20251215-150916-ladsgroup.json
- 15:06 ladsgroup@deploy2002: Finished scap sync-world: Backport for ParserOutput: Allow for ignoring a set of domains for externallinks (T405005) (duration: 08m 26s)
- 15:02 ladsgroup@deploy2002: ladsgroup: Continuing with sync
- 15:01 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1158 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86614 and previous config saved to /var/cache/conftool/dbconfig/20251215-150111-marostegui.json
- 15:01 ladsgroup@deploy2002: ladsgroup: Backport for ParserOutput: Allow for ignoring a set of domains for externallinks (T405005) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 15:01 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 15:00 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1158.eqiad.wmnet with reason: Maintenance
- 15:00 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2150 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86613 and previous config saved to /var/cache/conftool/dbconfig/20251215-150027-marostegui.json
- 15:00 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2150.codfw.wmnet with reason: Maintenance
- 14:59 marostegui@cumin1003: dbctl commit (dc=all): 'Repool db2148', diff saved to https://phabricator.wikimedia.org/P86612 and previous config saved to /var/cache/conftool/dbconfig/20251215-145943-marostegui.json
- 14:58 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2148 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86611 and previous config saved to /var/cache/conftool/dbconfig/20251215-145836-marostegui.json
- 14:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2148.codfw.wmnet with reason: Maintenance
- 14:58 ladsgroup@deploy2002: Started scap sync-world: Backport for ParserOutput: Allow for ignoring a set of domains for externallinks (T405005)
- 14:54 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 (T410589)', diff saved to https://phabricator.wikimedia.org/P86610 and previous config saved to /var/cache/conftool/dbconfig/20251215-145408-ladsgroup.json
- 14:45 urbanecm@deploy2002: Finished scap sync-world: Backport for enwiki: Enable HTML confirmation email (T410970), Enable HTML confirmation email on Wikidata and Commons (T410971) (duration: 10m 13s)
- 14:41 urbanecm@deploy2002: urbanecm: Continuing with sync
- 14:37 urbanecm@deploy2002: urbanecm: Backport for enwiki: Enable HTML confirmation email (T410970), Enable HTML confirmation email on Wikidata and Commons (T410971) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:35 urbanecm@deploy2002: Started scap sync-world: Backport for enwiki: Enable HTML confirmation email (T410970), Enable HTML confirmation email on Wikidata and Commons (T410971)
- 14:34 urbanecm@deploy2002: Finished scap sync-world: Backport for niawiktionary: update logo (T411850) (duration: 07m 27s)
- 14:30 urbanecm@deploy2002: urbanecm, anzx: Continuing with sync
- 14:29 urbanecm@deploy2002: urbanecm, anzx: Backport for niawiktionary: update logo (T411850) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:27 urbanecm@deploy2002: Started scap sync-world: Backport for niawiktionary: update logo (T411850)
- 14:20 ihurbain@deploy2002: Finished scap sync-world: Backport for Activate post-processing cache on idwiki (T348255) (duration: 12m 07s)
- 14:14 ihurbain@deploy2002: ihurbain: Continuing with sync
- 14:12 ihurbain@deploy2002: ihurbain: Backport for Activate post-processing cache on idwiki (T348255) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:09 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab2002.wikimedia.org
- 14:08 ihurbain@deploy2002: Started scap sync-world: Backport for Activate post-processing cache on idwiki (T348255)
- 14:06 urbanecm@deploy2002: Finished scap sync-world: Backport for Revert^4 "Confirmation email: further styling adjustments" (T411526), Revert^4 "i18n: replace <> to avoid false positive export errors" (T411526) (duration: 47m 56s)
- 14:06 moritzm: installing libpng1.6 security updates
- 14:03 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab2002.wikimedia.org
- 13:54 urbanecm@deploy2002: urbanecm: Continuing with sync
- 13:51 urbanecm@deploy2002: urbanecm: Backport for Revert^4 "Confirmation email: further styling adjustments" (T411526), Revert^4 "i18n: replace <> to avoid false positive export errors" (T411526) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:42 hashar@deploy2002: Finished deploy [releng/jenkins-deploy@7a991f5] (releasing): Redeploy to upgrade releases Jenkins T412694 (duration: 02m 05s)
- 13:41 hashar@deploy2002: Started deploy [releng/jenkins-deploy@7a991f5] (releasing): Redeploy to upgrade releases Jenkins T412694
- 13:28 Amir1: dropped databases of deleted wikis in x1 (T411835)
- 13:18 urbanecm@deploy2002: Started scap sync-world: Backport for Revert^4 "Confirmation email: further styling adjustments" (T411526), Revert^4 "i18n: replace <> to avoid false positive export errors" (T411526)
- 13:11 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
- 13:11 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
- 12:52 moritzm: imported jenkins 2.528.3 to thirdparty/jenkins for bookworm-wikimedia T412694
- 12:52 moritzm: imported jenkins 2.528.3 to thirdparty/ci for bullseye-wikimedia T412694
- 12:48 kharlan@deploy2002: Finished scap sync-world: Backport for Add acct_creation_throttle_hit equivalent for temp. accounts (T412105), Add 'acct_creation_throttle_hit-temp' (T412105), Add experimental temporary account creation rate limits (T412222) (duration: 72m 23s)
- 12:35 kharlan@deploy2002: kharlan, tchanders: Continuing with sync
- 12:31 kharlan@deploy2002: kharlan, tchanders: Backport for Add acct_creation_throttle_hit equivalent for temp. accounts (T412105), Add 'acct_creation_throttle_hit-temp' (T412105), Add experimental temporary account creation rate limits (T412222) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 12:30 urbanecm@deploy2002: mwscript-k8s job started: GrowthExperiments:fixLinkRecommendationData --wiki=itwiki --force --search-index --db-table # T412040-fix-02
- 12:06 urbanecm@deploy2002: mwscript-k8s job started: GrowthExperiments:fixLinkRecommendationData --wiki=itwiki --force --search-index --db-table # T412040-fix
- 11:57 urbanecm@deploy2002: mwscript-k8s job started: GrowthExperiments:fixLinkRecommendationData --wiki=itwiki --dry-run --search-index --db-table # T412040-fix-dryrun
- 11:56 marostegui: Deploy schema change on x1 with replication https://phabricator.wikimedia.org/T410241
- 11:56 urbanecm@deploy2002: mwscript-k8s job started: GrowthExperiments::fixLinkRecommendationData --wiki=itwiki --dry-run --search-index --db-table # T412040-fix-dryrun
- 11:36 kharlan@deploy2002: Started scap sync-world: Backport for Add acct_creation_throttle_hit equivalent for temp. accounts (T412105), Add 'acct_creation_throttle_hit-temp' (T412105), Add experimental temporary account creation rate limits (T412222)
- 11:30 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
- 11:30 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
- 11:03 marostegui: Deploy schema change on x1 with replication https://phabricator.wikimedia.org/T412687
- 11:00 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 11:00 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 10:45 phuedx@deploy2002: Finished scap sync-world: Backport for product_metrics.contributors.experiments stream needs use_edge_uniques (T405177 T410803), mobileSectionSwitch: experiment name change (T410803) (duration: 50m 59s)
- 10:38 jgiannelos@deploy2002: helmfile [codfw] DONE helmfile.d/services/kartotherian: apply
- 10:36 jgiannelos@deploy2002: helmfile [codfw] START helmfile.d/services/kartotherian: apply
- 10:33 jgiannelos@deploy2002: helmfile [eqiad] DONE helmfile.d/services/kartotherian: apply
- 10:33 phuedx@deploy2002: kemayo, phuedx: Continuing with sync
- 10:31 jgiannelos@deploy2002: helmfile [eqiad] START helmfile.d/services/kartotherian: apply
- 10:29 phuedx@deploy2002: kemayo, phuedx: Backport for product_metrics.contributors.experiments stream needs use_edge_uniques (T405177 T410803), mobileSectionSwitch: experiment name change (T410803) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 10:27 jgiannelos@deploy2002: helmfile [eqiad] DONE helmfile.d/services/kartotherian: apply
- 10:20 elukey: spicerack upgraded to 12.2.0 on cumin nodes
- 10:16 jgiannelos@deploy2002: helmfile [eqiad] START helmfile.d/services/kartotherian: apply
- 10:09 ihurbain@deploy2002: helmfile [eqiad] DONE helmfile.d/services/kartotherian: apply
- 09:58 ihurbain@deploy2002: helmfile [eqiad] START helmfile.d/services/kartotherian: apply
- 09:54 phuedx@deploy2002: Started scap sync-world: Backport for product_metrics.contributors.experiments stream needs use_edge_uniques (T405177 T410803), mobileSectionSwitch: experiment name change (T410803)
- 09:46 elukey: uploaded spicerack_12.2.0 to apt.wikimedia.org bullseye-wikimedia,bookworm-wikimedia
- 09:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2204.codfw.wmnet with reason: Maintenance
- 06:33 marostegui: Stop MariaDB on sretest2003 - T411173
- 06:26 marostegui@cumin1003: dbctl commit (dc=all): 'Remove sretest2003 from dbctl T411173', diff saved to https://phabricator.wikimedia.org/P86607 and previous config saved to /var/cache/conftool/dbconfig/20251215-062645-marostegui.json
- 06:09 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1222.eqiad.wmnet with reason: Maintenance
- 02:55 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
- 02:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86606 and previous config saved to /var/cache/conftool/dbconfig/20251215-025448-marostegui.json
- 02:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P86605 and previous config saved to /var/cache/conftool/dbconfig/20251215-023940-marostegui.json
- 02:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P86604 and previous config saved to /var/cache/conftool/dbconfig/20251215-022431-marostegui.json
- 02:09 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86603 and previous config saved to /var/cache/conftool/dbconfig/20251215-020923-marostegui.json
- 01:13 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 13m 09s)
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
2025-12-14
- 21:22 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2190 (T410589)', diff saved to https://phabricator.wikimedia.org/P86602 and previous config saved to /var/cache/conftool/dbconfig/20251214-212240-ladsgroup.json
- 21:22 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2190.codfw.wmnet with reason: Maintenance
- 21:22 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T410589)', diff saved to https://phabricator.wikimedia.org/P86601 and previous config saved to /var/cache/conftool/dbconfig/20251214-212226-ladsgroup.json
- 21:07 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P86600 and previous config saved to /var/cache/conftool/dbconfig/20251214-210717-ladsgroup.json
- 20:52 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P86599 and previous config saved to /var/cache/conftool/dbconfig/20251214-205208-ladsgroup.json
- 20:37 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T410589)', diff saved to https://phabricator.wikimedia.org/P86598 and previous config saved to /var/cache/conftool/dbconfig/20251214-203700-ladsgroup.json
- 20:12 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1259 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86597 and previous config saved to /var/cache/conftool/dbconfig/20251214-201213-marostegui.json
- 20:12 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1259.eqiad.wmnet with reason: Maintenance
- 20:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86596 and previous config saved to /var/cache/conftool/dbconfig/20251214-201148-marostegui.json
- 19:56 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P86595 and previous config saved to /var/cache/conftool/dbconfig/20251214-195640-marostegui.json
- 19:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P86594 and previous config saved to /var/cache/conftool/dbconfig/20251214-194132-marostegui.json
- 19:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86593 and previous config saved to /var/cache/conftool/dbconfig/20251214-192623-marostegui.json
- 14:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86592 and previous config saved to /var/cache/conftool/dbconfig/20251214-145800-marostegui.json
- 14:42 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238', diff saved to https://phabricator.wikimedia.org/P86591 and previous config saved to /var/cache/conftool/dbconfig/20251214-144251-marostegui.json
- 14:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238', diff saved to https://phabricator.wikimedia.org/P86590 and previous config saved to /var/cache/conftool/dbconfig/20251214-142743-marostegui.json
- 14:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86589 and previous config saved to /var/cache/conftool/dbconfig/20251214-141235-marostegui.json
- 13:28 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1254 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86588 and previous config saved to /var/cache/conftool/dbconfig/20251214-132817-marostegui.json
- 13:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1254.eqiad.wmnet with reason: Maintenance
- 08:31 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2238 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86587 and previous config saved to /var/cache/conftool/dbconfig/20251214-083116-marostegui.json
- 08:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2238.codfw.wmnet with reason: Maintenance
- 08:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86586 and previous config saved to /var/cache/conftool/dbconfig/20251214-083051-marostegui.json
- 08:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226', diff saved to https://phabricator.wikimedia.org/P86585 and previous config saved to /var/cache/conftool/dbconfig/20251214-081543-marostegui.json
- 08:00 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226', diff saved to https://phabricator.wikimedia.org/P86584 and previous config saved to /var/cache/conftool/dbconfig/20251214-080034-marostegui.json
- 07:45 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86583 and previous config saved to /var/cache/conftool/dbconfig/20251214-074526-marostegui.json
- 07:40 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
- 07:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86582 and previous config saved to /var/cache/conftool/dbconfig/20251214-073957-marostegui.json
- 07:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P86581 and previous config saved to /var/cache/conftool/dbconfig/20251214-072449-marostegui.json
- 07:09 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P86580 and previous config saved to /var/cache/conftool/dbconfig/20251214-070940-marostegui.json
- 06:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86579 and previous config saved to /var/cache/conftool/dbconfig/20251214-065432-marostegui.json
- 06:27 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2226 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86578 and previous config saved to /var/cache/conftool/dbconfig/20251214-062752-marostegui.json
- 06:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2226.codfw.wmnet with reason: Maintenance
- 06:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86577 and previous config saved to /var/cache/conftool/dbconfig/20251214-062727-marostegui.json
- 06:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225', diff saved to https://phabricator.wikimedia.org/P86576 and previous config saved to /var/cache/conftool/dbconfig/20251214-061219-marostegui.json
- 05:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225', diff saved to https://phabricator.wikimedia.org/P86575 and previous config saved to /var/cache/conftool/dbconfig/20251214-055711-marostegui.json
- 05:42 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86574 and previous config saved to /var/cache/conftool/dbconfig/20251214-054202-marostegui.json
- 03:20 eileen: civicrm upgraded from 8a0822ef to 03479639
- 01:59 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2177 (T410589)', diff saved to https://phabricator.wikimedia.org/P86573 and previous config saved to /var/cache/conftool/dbconfig/20251214-015920-ladsgroup.json
- 01:59 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2177.codfw.wmnet with reason: Maintenance
- 01:58 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T410589)', diff saved to https://phabricator.wikimedia.org/P86572 and previous config saved to /var/cache/conftool/dbconfig/20251214-015856-ladsgroup.json
- 01:43 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P86571 and previous config saved to /var/cache/conftool/dbconfig/20251214-014348-ladsgroup.json
- 01:30 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 29m 42s)
- 01:28 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P86570 and previous config saved to /var/cache/conftool/dbconfig/20251214-012839-ladsgroup.json
- 01:13 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T410589)', diff saved to https://phabricator.wikimedia.org/P86569 and previous config saved to /var/cache/conftool/dbconfig/20251214-011331-ladsgroup.json
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
- 00:56 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1233 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86568 and previous config saved to /var/cache/conftool/dbconfig/20251214-005607-marostegui.json
- 00:56 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1233.eqiad.wmnet with reason: Maintenance
- 00:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86567 and previous config saved to /var/cache/conftool/dbconfig/20251214-005542-marostegui.json
- 00:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P86566 and previous config saved to /var/cache/conftool/dbconfig/20251214-004034-marostegui.json
- 00:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P86565 and previous config saved to /var/cache/conftool/dbconfig/20251214-002526-marostegui.json
- 00:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86564 and previous config saved to /var/cache/conftool/dbconfig/20251214-001017-marostegui.json
2025-12-13
- 23:54 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2225 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86563 and previous config saved to /var/cache/conftool/dbconfig/20251213-235427-marostegui.json
- 23:54 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2225.codfw.wmnet with reason: Maintenance
- 23:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86562 and previous config saved to /var/cache/conftool/dbconfig/20251213-235413-marostegui.json
- 23:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P86561 and previous config saved to /var/cache/conftool/dbconfig/20251213-233905-marostegui.json
- 23:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P86560 and previous config saved to /var/cache/conftool/dbconfig/20251213-232356-marostegui.json
- 23:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86559 and previous config saved to /var/cache/conftool/dbconfig/20251213-230848-marostegui.json
- 17:54 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1229 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86558 and previous config saved to /var/cache/conftool/dbconfig/20251213-175442-marostegui.json
- 17:54 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1229.eqiad.wmnet with reason: Maintenance
- 17:11 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2207 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86557 and previous config saved to /var/cache/conftool/dbconfig/20251213-171057-marostegui.json
- 17:10 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2207.codfw.wmnet with reason: Maintenance
- 12:04 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
- 12:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86556 and previous config saved to /var/cache/conftool/dbconfig/20251213-120425-marostegui.json
- 11:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P86555 and previous config saved to /var/cache/conftool/dbconfig/20251213-114916-marostegui.json
- 11:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P86554 and previous config saved to /var/cache/conftool/dbconfig/20251213-113408-marostegui.json
- 11:22 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2197.codfw.wmnet with reason: Maintenance
- 11:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86553 and previous config saved to /var/cache/conftool/dbconfig/20251213-112229-marostegui.json
- 11:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86552 and previous config saved to /var/cache/conftool/dbconfig/20251213-111900-marostegui.json
- 11:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P86551 and previous config saved to /var/cache/conftool/dbconfig/20251213-110720-marostegui.json
- 10:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P86550 and previous config saved to /var/cache/conftool/dbconfig/20251213-105212-marostegui.json
- 10:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86549 and previous config saved to /var/cache/conftool/dbconfig/20251213-103704-marostegui.json
- 09:49 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1197 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86548 and previous config saved to /var/cache/conftool/dbconfig/20251213-094944-marostegui.json
- 09:49 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
- 09:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86547 and previous config saved to /var/cache/conftool/dbconfig/20251213-094920-marostegui.json
- 09:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P86546 and previous config saved to /var/cache/conftool/dbconfig/20251213-093412-marostegui.json
- 09:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P86545 and previous config saved to /var/cache/conftool/dbconfig/20251213-091903-marostegui.json
- 09:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86544 and previous config saved to /var/cache/conftool/dbconfig/20251213-090355-marostegui.json
- 07:34 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1188 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86543 and previous config saved to /var/cache/conftool/dbconfig/20251213-073445-marostegui.json
- 07:34 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
- 07:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86542 and previous config saved to /var/cache/conftool/dbconfig/20251213-073421-marostegui.json
- 07:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P86541 and previous config saved to /var/cache/conftool/dbconfig/20251213-071913-marostegui.json
- 07:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P86540 and previous config saved to /var/cache/conftool/dbconfig/20251213-070405-marostegui.json
- 06:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86539 and previous config saved to /var/cache/conftool/dbconfig/20251213-064856-marostegui.json
- 06:30 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2156 (T410589)', diff saved to https://phabricator.wikimedia.org/P86538 and previous config saved to /var/cache/conftool/dbconfig/20251213-063023-ladsgroup.json
- 06:30 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2156.codfw.wmnet with reason: Maintenance
- 06:30 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T410589)', diff saved to https://phabricator.wikimedia.org/P86537 and previous config saved to /var/cache/conftool/dbconfig/20251213-062958-ladsgroup.json
- 06:14 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P86536 and previous config saved to /var/cache/conftool/dbconfig/20251213-061450-ladsgroup.json
- 05:59 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P86535 and previous config saved to /var/cache/conftool/dbconfig/20251213-055942-ladsgroup.json
- 05:44 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T410589)', diff saved to https://phabricator.wikimedia.org/P86534 and previous config saved to /var/cache/conftool/dbconfig/20251213-054433-ladsgroup.json
- 05:02 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2189 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86533 and previous config saved to /var/cache/conftool/dbconfig/20251213-050223-marostegui.json
- 05:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2189.codfw.wmnet with reason: Maintenance
- 05:02 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86532 and previous config saved to /var/cache/conftool/dbconfig/20251213-050158-marostegui.json
- 04:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P86531 and previous config saved to /var/cache/conftool/dbconfig/20251213-044649-marostegui.json
- 04:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P86530 and previous config saved to /var/cache/conftool/dbconfig/20251213-043141-marostegui.json
- 04:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86529 and previous config saved to /var/cache/conftool/dbconfig/20251213-041633-marostegui.json
- 01:18 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 17m 42s)
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
2025-12-12
- 23:34 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1182 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86528 and previous config saved to /var/cache/conftool/dbconfig/20251212-233453-marostegui.json
- 23:34 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
- 23:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86527 and previous config saved to /var/cache/conftool/dbconfig/20251212-233428-marostegui.json
- 23:22 tzatziki: removing 1 file for legal compliance
- 23:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P86526 and previous config saved to /var/cache/conftool/dbconfig/20251212-231920-marostegui.json
- 23:16 tzatziki: removing 4 files for legal compliance
- 23:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P86525 and previous config saved to /var/cache/conftool/dbconfig/20251212-230412-marostegui.json
- 22:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86524 and previous config saved to /var/cache/conftool/dbconfig/20251212-224903-marostegui.json
- 21:32 tzatziki: removing 4 files for legal compliance
- 21:23 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2175 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86523 and previous config saved to /var/cache/conftool/dbconfig/20251212-212305-marostegui.json
- 21:22 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2175.codfw.wmnet with reason: Maintenance
- 21:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86522 and previous config saved to /var/cache/conftool/dbconfig/20251212-212240-marostegui.json
- 21:13 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1162 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86521 and previous config saved to /var/cache/conftool/dbconfig/20251212-211309-marostegui.json
- 21:13 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
- 21:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86520 and previous config saved to /var/cache/conftool/dbconfig/20251212-211245-marostegui.json
- 21:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P86519 and previous config saved to /var/cache/conftool/dbconfig/20251212-210731-marostegui.json
- 20:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P86518 and previous config saved to /var/cache/conftool/dbconfig/20251212-205737-marostegui.json
- 20:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P86517 and previous config saved to /var/cache/conftool/dbconfig/20251212-205223-marostegui.json
- 20:42 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P86516 and previous config saved to /var/cache/conftool/dbconfig/20251212-204228-marostegui.json
- 20:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86515 and previous config saved to /var/cache/conftool/dbconfig/20251212-203715-marostegui.json
- 20:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86514 and previous config saved to /var/cache/conftool/dbconfig/20251212-202720-marostegui.json
- 19:35 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker2005.codfw.wmnet with OS bookworm
- 19:35 jhancock@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin1003"
- 17:48 jhancock@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin1003"
- 17:30 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker2005.codfw.wmnet with reason: host reimage
- 17:26 jhancock@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker2005.codfw.wmnet with reason: host reimage
- 17:16 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker2005.codfw.wmnet with OS bookworm
- 16:54 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
- 16:54 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
- 15:55 xcollazo@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
- 15:55 xcollazo@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
- 15:15 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
- 15:15 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
- 14:51 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti-jumbo1002.eqiad.wmnet with OS trixie
- 14:51 jclark@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 14:51 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti-jumbo1003.eqiad.wmnet with OS trixie
- 14:51 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 14:34 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 14:19 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 14:18 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti-jumbo1003.eqiad.wmnet with reason: host reimage
- 14:13 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti-jumbo1003.eqiad.wmnet with reason: host reimage
- 14:04 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti-jumbo1002.eqiad.wmnet with reason: host reimage
- 14:03 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host ganeti-jumbo1003.eqiad.wmnet with OS trixie
- 14:00 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti-jumbo1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 14:00 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti-jumbo1002.eqiad.wmnet with reason: host reimage
- 13:50 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ganeti-jumbo1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 13:50 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host ganeti-jumbo1002.eqiad.wmnet with OS trixie
- 13:41 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 13:41 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2148 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86512 and previous config saved to /var/cache/conftool/dbconfig/20251212-134125-marostegui.json
- 13:41 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 13:41 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2148.codfw.wmnet with reason: Maintenance
- 13:41 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
- 13:40 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 13:38 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 13:37 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 13:37 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
- 13:36 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
- 13:22 gehel@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wdqs1029.eqiad.wmnet with OS trixie
- 13:22 gehel@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1029.eqiad.wmnet with OS trixie
- 13:14 jgleeson: payments-wiki upgraded from 99671dda to fc18b3c0
- 10:47 gehel@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wdqs1032.eqiad.wmnet with OS trixie
- 10:39 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2149 (T410589)', diff saved to https://phabricator.wikimedia.org/P86511 and previous config saved to /var/cache/conftool/dbconfig/20251212-103907-ladsgroup.json
- 10:38 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2149.codfw.wmnet with reason: Maintenance
- 10:16 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid: apply
- 10:16 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid: apply
- 10:11 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid: apply
- 10:10 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid: apply
- 09:51 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 09:51 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 09:19 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
- 09:19 kartik@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
- 09:14 gehel@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1032.eqiad.wmnet with reason: host reimage
- 09:09 gehel@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1032.eqiad.wmnet with reason: host reimage
- 09:08 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 09:08 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 08:55 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 8560
- 08:54 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 8560
- 08:51 gehel@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1032.eqiad.wmnet with OS trixie
- 08:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 14537
- 08:36 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 14537
- 08:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 12709
- 08:30 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 12709
- 02:20 ejegg: donorwiki upgraded from bbd96c00 to 99671dda
- 02:08 ejegg: payments-wiki upgraded from 460c2f5d to 99671dda
- 01:30 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 29m 34s)
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
- 00:36 larssandergreen: Updating civicrm from 1a5626c4 to 8a0822ef
2025-12-11
- 23:27 kemayo@deploy2002: Finished scap sync-world: Backport for Add product_metrics.contributors.experiments to wgMetricsPlatformExperimentStreamNames (T405177 T410803) (duration: 07m 38s)
- 23:25 jhancock@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker2005.codfw.wmnet with OS bookworm
- 23:23 kemayo@deploy2002: kemayo: Continuing with sync
- 23:22 kemayo@deploy2002: kemayo: Backport for Add product_metrics.contributors.experiments to wgMetricsPlatformExperimentStreamNames (T405177 T410803) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 23:20 kemayo@deploy2002: Started scap sync-world: Backport for Add product_metrics.contributors.experiments to wgMetricsPlatformExperimentStreamNames (T405177 T410803)
- 22:40 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker2004.codfw.wmnet with OS bookworm
- 22:40 jhancock@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin1003"
- 22:39 jhancock@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin1003"
- 22:30 maryum: Deployed security fix for T411305
- 22:19 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker2004.codfw.wmnet with reason: host reimage
- 22:16 jhancock@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker2004.codfw.wmnet with reason: host reimage
- 22:05 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker2005.codfw.wmnet with OS bookworm
- 22:05 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker2004.codfw.wmnet with OS bookworm
- 21:54 dani@deploy2002: Finished scap sync-world: Backport for Undeploy 2025 Global Readers Survey (T410918), Test Kitchen: StickyHeaders experiment hotfix (T412146) (duration: 09m 07s)
- 21:52 gehel@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wdqs1031.eqiad.wmnet with OS trixie
- 21:49 dani@deploy2002: dani, cjming: Continuing with sync
- 21:47 dani@deploy2002: dani, cjming: Backport for Undeploy 2025 Global Readers Survey (T410918), Test Kitchen: StickyHeaders experiment hotfix (T412146) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:45 dani@deploy2002: Started scap sync-world: Backport for Undeploy 2025 Global Readers Survey (T410918), Test Kitchen: StickyHeaders experiment hotfix (T412146)
- 21:35 urbanecm@deploy2002: Finished scap sync-world: Backport for Revert^3 "Confirmation email: further styling adjustments" (T411526), Revert^3 "i18n: replace <> to avoid false positive export errors" (T411526) (duration: 46m 04s)
- 21:22 urbanecm@deploy2002: urbanecm: Continuing with sync
- 21:22 urbanecm@deploy2002: urbanecm: Backport for Revert^3 "Confirmation email: further styling adjustments" (T411526), Revert^3 "i18n: replace <> to avoid false positive export errors" (T411526) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:17 xcollazo@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
- 21:17 xcollazo@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
- 21:14 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ganeti-jumbo1002.eqiad.wmnet with OS trixie
- 21:10 xcollazo@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
- 21:09 xcollazo@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
- 20:57 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti-jumbo1002.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 20:56 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ganeti-jumbo1002.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 20:50 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti-jumbo1001.eqiad.wmnet with OS trixie
- 20:50 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 20:49 urbanecm@deploy2002: Started scap sync-world: Backport for Revert^3 "Confirmation email: further styling adjustments" (T411526), Revert^3 "i18n: replace <> to avoid false positive export errors" (T411526)
- 20:40 urbanecm@deploy2002: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.4,1.46.0-wmf.5,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/me
- 20:38 brett@dns1006: END - running authdns-update
- 20:37 brett@dns1006: START - running authdns-update
- 20:34 gehel@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1031.eqiad.wmnet with reason: host reimage
- 20:28 gehel@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1031.eqiad.wmnet with reason: host reimage
- 20:25 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 20:10 gehel@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1031.eqiad.wmnet with OS trixie
- 20:09 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti-jumbo1001.eqiad.wmnet with reason: host reimage
- 20:02 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti-jumbo1001.eqiad.wmnet with reason: host reimage
- 19:54 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host ganeti-jumbo1002.eqiad.wmnet with OS trixie
- 19:51 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host ganeti-jumbo1001.eqiad.wmnet with OS trixie
- 19:49 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti-jumbo1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:48 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ganeti-jumbo1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:47 urbanecm@deploy2002: Started scap sync-world: Backport for Revert^2 "Confirmation email: further styling adjustments" (T411526), Revert^2 "i18n: replace <> to avoid false positive export errors" (T411526)
- 19:46 urbanecm@deploy2002: Finished scap sync-world: Backport for [Growth] Enable Add Link backend on a handful of wikis (T410469) (duration: 08m 55s)
- 19:42 urbanecm@deploy2002: urbanecm: Continuing with sync
- 19:41 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker2005.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 19:40 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 19:40 urbanecm@deploy2002: urbanecm: Backport for [Growth] Enable Add Link backend on a handful of wikis (T410469) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 19:37 urbanecm@deploy2002: Started scap sync-world: Backport for [Growth] Enable Add Link backend on a handful of wikis (T410469)
- 19:29 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker2005.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 19:28 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 19:27 jhancock@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker2005
- 19:27 jhancock@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker2004
- 19:27 jhancock@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker2005
- 19:27 jhancock@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker2004
- 19:27 jhancock@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 19:27 jhancock@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-worker2004-5 to codfw - jhancock@cumin1003"
- 19:26 jhancock@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-worker2004-5 to codfw - jhancock@cumin1003"
- 19:23 jhancock@cumin1003: START - Cookbook sre.dns.netbox
- 19:15 Krinkle: krinkle@deploy1002 sql --write wikifunctionswiki `UPDATE page SET page_touched='20251211191600' WHERE page_id=66102 LIMIT 1;`
- 18:51 rzl@deploy2002: Finished scap sync-world: https://gerrit.wikimedia.org/r/1217347 T410975 (duration: 04m 54s)
- 18:50 rzl@deploy2002: rzl: Continuing with sync
- 18:48 rzl@deploy2002: rzl: https://gerrit.wikimedia.org/r/1217347 T410975 synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 18:47 rzl@deploy2002: Started scap sync-world: https://gerrit.wikimedia.org/r/1217347 T410975
- 17:37 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
- 17:37 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
- 17:20 gehel@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wdqs1030.eqiad.wmnet with OS trixie
- 17:17 gehel@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wdqs1029.eqiad.wmnet with OS trixie
- 16:23 jhathaway: upload new package of corto via reprepro
- 15:47 gehel@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1030.eqiad.wmnet with reason: host reimage
- 15:43 gehel@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1029.eqiad.wmnet with reason: host reimage
- 15:43 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
- 15:43 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
- 15:41 gehel@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1030.eqiad.wmnet with reason: host reimage
- 15:39 gehel@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1029.eqiad.wmnet with reason: host reimage
- 15:24 gehel@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1030.eqiad.wmnet with OS trixie
- 15:21 gehel@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1029.eqiad.wmnet with OS trixie
- 15:13 gehel@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1028.eqiad.wmnet with OS trixie
- 15:02 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti-jumbo1002.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 15:01 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti-jumbo1001.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 14:54 gehel@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1028.eqiad.wmnet with reason: host reimage
- 14:50 gehel@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1028.eqiad.wmnet with reason: host reimage
- 14:48 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ganeti-jumbo1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 14:48 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ganeti-jumbo1002.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 14:48 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ganeti-jumbo1001.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 14:43 jclark@cumin1003: START - Cookbook sre.dns.netbox
- 14:37 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
- 14:37 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
- 14:34 kemayo@deploy2002: Finished scap sync-world: Backport for Localisation updates from https://translatewiki.net. (duration: 10m 44s)
- 14:33 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab
- 14:33 gehel@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1028.eqiad.wmnet with OS trixie
- 14:30 kemayo@deploy2002: kemayo: Continuing with sync
- 14:28 kemayo@deploy2002: kemayo: Backport for Localisation updates from https://translatewiki.net. synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:23 kemayo@deploy2002: Started scap sync-world: Backport for Localisation updates from https://translatewiki.net.
- 14:21 aude@deploy2002: Finished scap sync-world: Backport for [Legal Footer] Deploy Legal Footer for Phase 1 wikis (T410164) (duration: 11m 32s)
- 14:16 aude@deploy2002: lmora, aude: Continuing with sync
- 14:11 aude@deploy2002: lmora, aude: Backport for [Legal Footer] Deploy Legal Footer for Phase 1 wikis (T410164) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:09 aude@deploy2002: Started scap sync-world: Backport for [Legal Footer] Deploy Legal Footer for Phase 1 wikis (T410164)
- 14:08 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host logging-sd1007.eqiad.wmnet with OS bookworm
- 14:08 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 14:07 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 14:07 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 14:07 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 14:05 mforns@deploy2002: helmfile [codfw] DONE helmfile.d/services/page-analytics: apply
- 14:05 mforns@deploy2002: helmfile [codfw] START helmfile.d/services/page-analytics: apply
- 14:05 mforns@deploy2002: helmfile [eqiad] DONE helmfile.d/services/page-analytics: apply
- 14:05 mforns@deploy2002: helmfile [eqiad] START helmfile.d/services/page-analytics: apply
- 14:04 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host logging-sd1006.eqiad.wmnet with OS bookworm
- 14:04 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 14:04 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 14:03 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host logging-sd1005.eqiad.wmnet with OS bookworm
- 14:03 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 14:00 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 13:58 mforns@deploy2002: helmfile [staging] DONE helmfile.d/services/page-analytics: apply
- 13:58 mforns@deploy2002: helmfile [staging] START helmfile.d/services/page-analytics: apply
- 13:50 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on logging-sd1007.eqiad.wmnet with reason: host reimage
- 13:50 XioNoX: restart gnmic on netflow1002
- 13:47 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on logging-sd1006.eqiad.wmnet with reason: host reimage
- 13:43 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on logging-sd1005.eqiad.wmnet with reason: host reimage
- 13:43 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on logging-sd1006.eqiad.wmnet with reason: host reimage
- 13:43 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on logging-sd1007.eqiad.wmnet with reason: host reimage
- 13:40 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on logging-sd1005.eqiad.wmnet with reason: host reimage
- 13:20 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab
- 13:17 krinkle@deploy2002: Finished deploy [performance/navtiming@dde77b9]: Add temporary group for parsoid readviews (duration: 00m 16s)
- 13:17 krinkle@deploy2002: Started deploy [performance/navtiming@dde77b9]: Add temporary group for parsoid readviews
- 13:00 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host logging-sd1005.eqiad.wmnet with OS bookworm
- 13:00 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aqs1025.eqiad.wmnet with OS bullseye
- 13:00 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 12:59 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 12:59 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host logging-sd1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 12:54 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host logging-sd1006.eqiad.wmnet with OS bookworm
- 12:54 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host logging-sd1007.eqiad.wmnet with OS bookworm
- 12:53 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
- 12:53 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
- 12:52 jelto@cumin1003: END (FAIL) - Cookbook sre.gitlab.upgrade (exit_code=99) on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab
- 12:52 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab
- 12:48 jclark@cumin1003: START - Cookbook sre.hosts.provision for host logging-sd1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 12:48 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host logging-sd1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 12:46 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host logging-sd1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 12:44 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aqs1025.eqiad.wmnet with reason: host reimage
- 12:40 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aqs1025.eqiad.wmnet with reason: host reimage
- 12:36 jclark@cumin1003: START - Cookbook sre.hosts.provision for host logging-sd1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 12:34 jclark@cumin1003: START - Cookbook sre.hosts.provision for host logging-sd1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 12:33 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host logging-sd1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 12:32 jclark@cumin1003: START - Cookbook sre.hosts.provision for host logging-sd1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 12:30 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host aqs1025.eqiad.wmnet with OS bullseye
- 12:26 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aqs1025.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 12:18 jclark@cumin1003: START - Cookbook sre.hosts.provision for host aqs1025.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 12:17 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aqs1025.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 12:16 jclark@cumin1003: START - Cookbook sre.hosts.provision for host aqs1025.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 12:06 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab replica
- 11:57 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab replica
- 11:55 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-platform-eng: apply
- 11:55 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-platform-eng: apply
- 11:43 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab replica
- 11:34 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab replica
- 11:05 jelto@cumin1003: END (FAIL) - Cookbook sre.gitlab.upgrade (exit_code=99) on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab replica
- 11:01 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab replica
- 10:57 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-platform-eng: apply
- 10:57 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-platform-eng: apply
- 10:37 jelto@cumin1003: END (FAIL) - Cookbook sre.gitlab.upgrade (exit_code=99) on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab replica
- 10:37 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab replica
- 10:19 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
- 10:19 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
- 10:00 topranks: revert esams transport load balancing
- 09:41 XioNoX: revert eqsin transport load balancing
- 09:21 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-platform-eng: apply
- 09:20 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-platform-eng: apply
- 03:21 eileen: civicrm upgraded from 41a460d5 to 1a5626c4
- 03:11 ejegg: fundraising python tools upgraded from 8e900e85 to c75f7625
- 01:23 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 22m 36s)
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
- 00:30 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host logging-sd1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 00:29 jclark@cumin1003: START - Cookbook sre.hosts.provision for host logging-sd1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 00:16 rzl: rzl@apt1002:~$ sudo -i reprepro copy trixie-wikimedia bullseye-wikimedia envoyproxy # T410975
- 00:16 rzl: rzl@apt1002:~$ sudo -i reprepro copy bookworm-wikimedia bullseye-wikimedia envoyproxy # T410975
- 00:16 rzl: rzl@apt1002:~$ sudo -i reprepro -C main includedeb bullseye-wikimedia /srv/wikimedia/pool/component/envoy-future/e/envoyproxy/envoyproxy_1.35.7-1_amd64.deb # T410975
2025-12-10
- 23:51 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 23:50 rzl@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 23:47 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host logging-sd1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 23:47 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host logging-sd1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 23:47 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
- 23:47 rzl@deploy2002: helmfile [staging] START helmfile.d/services/api-gateway: apply
- 23:46 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host logging-sd1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 23:44 jclark@cumin1003: START - Cookbook sre.hosts.provision for host logging-sd1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 23:44 jclark@cumin1003: START - Cookbook sre.hosts.provision for host logging-sd1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 23:44 jclark@cumin1003: START - Cookbook sre.hosts.provision for host logging-sd1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 23:41 rzl: rzl@deploy2002:/srv/deployment-charts/helmfile.d/services/mw-debug$ helmfile -e codfw -i apply -l name=pinkunicorn --context=5 # T410975
- 23:40 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
- 23:40 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
- 23:36 rzl: rzl@deploy2002:/srv/deployment-charts/helmfile.d/services/mw-debug$ helmfile -e codfw -i apply -l name=pinkunicorn --set mesh.image_name=envoy-future --set mesh.image_version=1.35.7-1 --context=5 # T410975
- 23:35 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
- 23:35 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
- 23:30 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mathoid: apply
- 23:29 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mathoid: apply
- 23:28 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mathoid: apply
- 23:27 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/mathoid: apply
- 23:08 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/mathoid: apply
- 23:07 rzl@deploy2002: helmfile [staging] START helmfile.d/services/mathoid: apply
- 23:04 cjming@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic: apply
- 23:04 cjming@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic: apply
- 22:58 sfaci@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic-next: apply
- 22:57 sfaci@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply
- 22:22 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti-jumbo2001.codfw.wmnet with OS trixie
- 22:22 jhancock@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin1003"
- 22:19 jhancock@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin1003"
- 22:15 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti-jumbo2003.codfw.wmnet with OS trixie
- 22:15 jhancock@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin1003"
- 22:14 jhancock@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin1003"
- 22:11 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti-jumbo2002.codfw.wmnet with OS trixie
- 22:11 jhancock@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin1003"
- 22:10 jhancock@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin1003"
- 22:02 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti-jumbo2001.codfw.wmnet with reason: host reimage
- 21:58 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti-jumbo2003.codfw.wmnet with reason: host reimage
- 21:54 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti-jumbo2002.codfw.wmnet with reason: host reimage
- 21:50 jhancock@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti-jumbo2003.codfw.wmnet with reason: host reimage
- 21:49 jhancock@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti-jumbo2001.codfw.wmnet with reason: host reimage
- 21:49 jhancock@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti-jumbo2002.codfw.wmnet with reason: host reimage
- 21:49 kemayo@deploy2002: Finished scap sync-world: Backport for Add experiment + tracking for mobile section switching (T410803), mobileSectionSwitch: action_context needs to be stringified (T410803) (duration: 09m 40s)
- 21:44 kemayo@deploy2002: kemayo: Continuing with sync
- 21:42 kemayo@deploy2002: kemayo: Backport for Add experiment + tracking for mobile section switching (T410803), mobileSectionSwitch: action_context needs to be stringified (T410803) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:39 kemayo@deploy2002: Started scap sync-world: Backport for Add experiment + tracking for mobile section switching (T410803), mobileSectionSwitch: action_context needs to be stringified (T410803)
- 21:38 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host ganeti-jumbo2003.codfw.wmnet with OS trixie
- 21:38 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host ganeti-jumbo2002.codfw.wmnet with OS trixie
- 21:38 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host ganeti-jumbo2001.codfw.wmnet with OS trixie
- 21:36 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti-jumbo2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 21:35 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti-jumbo2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 21:33 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti-jumbo2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 21:32 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228 (T410589)', diff saved to https://phabricator.wikimedia.org/P86509 and previous config saved to /var/cache/conftool/dbconfig/20251210-213235-ladsgroup.json
- 21:30 jsn@deploy2002: Finished scap sync-world: Backport for Enable revertrisk filters in thwiki (T409438) (duration: 08m 51s)
- 21:25 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host ganeti-jumbo2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 21:25 jsn@deploy2002: kgraessle, jsn: Continuing with sync
- 21:25 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host ganeti-jumbo2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 21:24 bking@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on wdqs1014.eqiad.wmnet with reason: catching up on lag
- 21:23 jsn@deploy2002: kgraessle, jsn: Backport for Enable revertrisk filters in thwiki (T409438) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:22 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host ganeti-jumbo2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 21:21 jsn@deploy2002: Started scap sync-world: Backport for Enable revertrisk filters in thwiki (T409438)
- 21:20 jhancock@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti-jumbo2003
- 21:20 jhancock@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti-jumbo2002
- 21:20 jhancock@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti-jumbo2001
- 21:20 jhancock@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ganeti-jumbo2003
- 21:20 jhancock@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ganeti-jumbo2002
- 21:20 jhancock@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ganeti-jumbo2001
- 21:20 jhancock@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 21:20 jhancock@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding ganeti-jumbo2001-3 to codfw - jhancock@cumin1003"
- 21:20 jhancock@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding ganeti-jumbo2001-3 to codfw - jhancock@cumin1003"
- 21:19 sbassett@deploy2002: Finished scap sync-world: Backport for Set CSP Report Only mode for group1 wikis (T291867) (duration: 10m 34s)
- 21:17 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228', diff saved to https://phabricator.wikimedia.org/P86508 and previous config saved to /var/cache/conftool/dbconfig/20251210-211728-ladsgroup.json
- 21:16 jhancock@cumin1003: START - Cookbook sre.dns.netbox
- 21:12 sbassett@deploy2002: sbassett: Continuing with sync
- 21:12 sbassett@deploy2002: sbassett: Backport for Set CSP Report Only mode for group1 wikis (T291867) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:08 sbassett@deploy2002: Started scap sync-world: Backport for Set CSP Report Only mode for group1 wikis (T291867)
- 21:06 larssandergreen: Updating civicrm from 764fa3a8 to 41a460d5
- 21:02 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228', diff saved to https://phabricator.wikimedia.org/P86507 and previous config saved to /var/cache/conftool/dbconfig/20251210-210220-ladsgroup.json
- 21:01 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aqs1026.eqiad.wmnet with OS bullseye
- 21:01 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 21:01 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 21:00 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aqs1024.eqiad.wmnet with OS bullseye
- 21:00 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 21:00 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aqs1027.eqiad.wmnet with OS bullseye
- 21:00 jclark@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 20:57 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 20:52 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 20:49 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aqs1023.eqiad.wmnet with OS bullseye
- 20:49 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 20:48 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 20:47 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228 (T410589)', diff saved to https://phabricator.wikimedia.org/P86506 and previous config saved to /var/cache/conftool/dbconfig/20251210-204712-ladsgroup.json
- 20:44 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aqs1026.eqiad.wmnet with reason: host reimage
- 20:41 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aqs1027.eqiad.wmnet with reason: host reimage
- 20:37 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aqs1024.eqiad.wmnet with reason: host reimage
- 20:33 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aqs1026.eqiad.wmnet with reason: host reimage
- 20:33 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aqs1027.eqiad.wmnet with reason: host reimage
- 20:33 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aqs1023.eqiad.wmnet with reason: host reimage
- 20:33 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aqs1024.eqiad.wmnet with reason: host reimage
- 20:31 ryankemper: [WDQS] `ryankemper@wdqs1014:~$ sudo systemctl restart wdqs-blazegraph` to unstick deadlock
- 20:30 urbanecm@deploy2002: Finished scap sync-world: test (duration: 76m 31s)
- 20:29 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aqs1023.eqiad.wmnet with reason: host reimage
- 20:23 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host aqs1027.eqiad.wmnet with OS bullseye
- 20:23 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host aqs1026.eqiad.wmnet with OS bullseye
- 20:22 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host aqs1024.eqiad.wmnet with OS bullseye
- 20:18 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host aqs1023.eqiad.wmnet with OS bullseye
- 20:18 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aqs1027.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 20:18 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aqs1024.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 20:17 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aqs1026.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 20:13 topranks: Remove 2x40G LAGs between ssw1-d1-eqiad ssw1-d8-eqiad and asw2-c-eqiad asw2-d-eqiad
- 20:10 jclark@cumin1003: START - Cookbook sre.hosts.provision for host aqs1027.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 20:10 jclark@cumin1003: START - Cookbook sre.hosts.provision for host aqs1026.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 20:09 jclark@cumin1003: START - Cookbook sre.hosts.provision for host aqs1024.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 20:05 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs1018.eqiad.wmnet with OS bullseye
- 19:53 eevans@deploy2002: helmfile [eqiad] DONE helmfile.d/services/data-gateway: apply
- 19:52 eevans@deploy2002: helmfile [eqiad] START helmfile.d/services/data-gateway: apply
- 19:52 eevans@deploy2002: helmfile [codfw] DONE helmfile.d/services/data-gateway: apply
- 19:51 eevans@deploy2002: helmfile [codfw] START helmfile.d/services/data-gateway: apply
- 19:48 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1018.eqiad.wmnet with reason: host reimage
- 19:44 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs1018.eqiad.wmnet with reason: host reimage
- 19:29 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs1018.eqiad.wmnet with OS bullseye
- 19:14 urbanecm@deploy2002: Started scap sync-world: test
- 19:08 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1018.eqiad.wmnet with reason: T411781
- 19:06 brett: stop pybal/puppet on lvs1018 (T411781)
- 19:03 topranks: disable BGP on cr1-eqiad and cr2-eqiad to lvs1018 to fail over to lvs1020 (T411781)
- 19:03 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1018.eqiad.wmnet
- 19:03 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs1018.eqiad.wmnet
- 18:38 urbanecm@deploy2002: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.4,1.46.0-wmf.5,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/me
- 18:27 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host logging-sd2005.codfw.wmnet with OS bookworm
- 18:26 jhancock@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin1003"
- 18:26 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host logging-sd2007.codfw.wmnet with OS bookworm
- 18:26 jhancock@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin1003"
- 18:06 jhancock@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin1003"
- 17:48 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on logging-sd2005.codfw.wmnet with reason: host reimage
- 17:46 urbanecm@deploy2002: Started scap sync-world: Backport for Confirmation email: further styling adjustments (T411526), i18n: replace <> to avoid false positive export errors
- 17:44 jhancock@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on logging-sd2005.codfw.wmnet with reason: host reimage
- 17:44 urbanecm@deploy2002: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.4,1.46.0-wmf.5,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/me
- 17:35 jhancock@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin1003"
- 17:19 jgleeson: civicrm upgraded from 5a21fb9c to 764fa3a8
- 17:18 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host logging-sd2006.codfw.wmnet with OS bookworm
- 17:18 jhancock@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin1003"
- 17:17 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on logging-sd2007.codfw.wmnet with reason: host reimage
- 17:16 jhancock@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin1003"
- 17:13 jhancock@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on logging-sd2007.codfw.wmnet with reason: host reimage
- 17:05 larssandergreen: Updating civicrm from bdf84821 to 5a21fb9c
- 17:03 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-build1001.eqiad.wmnet with OS trixie
- 17:03 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-build1001.eqiad.wmnet with OS trixie
- 17:02 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host logging-sd2007.codfw.wmnet with OS bookworm
- 17:02 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host logging-sd2005.codfw.wmnet with OS bookworm
- 16:59 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on logging-sd2006.codfw.wmnet with reason: host reimage
- 16:53 jhancock@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on logging-sd2006.codfw.wmnet with reason: host reimage
- 16:49 urbanecm@deploy2002: Started scap sync-world: Backport for Confirmation email: further styling adjustments (T411526), i18n: replace <> to avoid false positive export errors
- 16:48 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-build1001.eqiad.wmnet with reason: host reimage
- 16:43 dpogorzelski@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-build1001.eqiad.wmnet with reason: host reimage
- 16:42 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host logging-sd2006.codfw.wmnet with OS bookworm
- 16:39 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host logging-sd2005.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 16:36 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host logging-sd2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 16:31 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host logging-sd2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 16:30 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host logging-sd2005.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 16:27 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host logging-sd2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 16:27 bking@dns1004: END - running authdns-update
- 16:26 bking@dns1004: START - running authdns-update
- 16:26 dpogorzelski@cumin1003: START - Cookbook sre.hosts.reimage for host ml-build1001.eqiad.wmnet with OS trixie
- 16:11 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host logging-sd2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 16:11 jhancock@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host logging-sd2007
- 16:10 jhancock@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host logging-sd2006
- 16:10 jhancock@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host logging-sd2007
- 16:10 jhancock@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host logging-sd2006
- 15:34 dreamyjazz@deploy2002: Finished scap sync-world: Backport for Define config for v2 of suggested investigations instrument (T409260) (duration: 06m 47s)
- 15:33 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aqs1023.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 15:30 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
- 15:29 dreamyjazz@deploy2002: dreamyjazz: Backport for Define config for v2 of suggested investigations instrument (T409260) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 15:27 dreamyjazz@deploy2002: Started scap sync-world: Backport for Define config for v2 of suggested investigations instrument (T409260)
- 15:25 gengh@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
- 15:25 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aqs1023.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 15:25 gengh@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
- 15:25 gengh@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
- 15:24 gengh@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
- 15:24 gengh@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 15:23 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host logging-sd1005
- 15:23 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host logging-sd1005
- 15:23 gengh@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 15:22 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host logging-sd1007
- 15:22 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host logging-sd1007
- 15:21 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host logging-sd1006
- 15:21 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host logging-sd1006
- 15:20 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host logging-sd1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 15:18 jclark@cumin1003: START - Cookbook sre.hosts.provision for host logging-sd1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 15:15 gengh@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
- 15:15 gengh@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
- 15:14 gengh@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
- 15:14 gengh@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
- 15:13 gengh@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 15:13 gengh@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 15:10 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aqs1027.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 15:10 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aqs1026.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 15:10 jclark@cumin1003: START - Cookbook sre.hosts.provision for host aqs1027.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 15:10 jclark@cumin1003: START - Cookbook sre.hosts.provision for host aqs1026.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 15:09 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aqs1024.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 15:09 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aqs1023.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 15:09 jclark@cumin1003: START - Cookbook sre.hosts.provision for host aqs1024.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 15:09 gengh@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
- 15:09 jclark@cumin1003: START - Cookbook sre.hosts.provision for host aqs1023.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 15:08 gengh@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
- 15:08 gengh@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
- 15:08 gengh@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
- 15:07 gengh@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 15:06 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aqs1027.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 15:06 gengh@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 15:06 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aqs1026.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 14:59 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aqs1025.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 14:57 jclark@cumin1003: START - Cookbook sre.hosts.provision for host aqs1027.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 14:56 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aqs1024.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 14:56 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aqs1023.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 14:56 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 14:56 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added network and mgmt aqs servers - jclark@cumin1003"
- 14:56 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added network and mgmt aqs servers - jclark@cumin1003"
- 14:56 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
- 14:56 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
- 14:52 jclark@cumin1003: START - Cookbook sre.dns.netbox
- 14:52 jclark@cumin1003: START - Cookbook sre.hosts.provision for host aqs1026.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 14:51 jclark@cumin1003: START - Cookbook sre.hosts.provision for host aqs1025.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 14:51 jclark@cumin1003: START - Cookbook sre.hosts.provision for host aqs1024.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 14:51 jclark@cumin1003: START - Cookbook sre.hosts.provision for host aqs1023.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 14:49 Lucas_WMDE: UTC afternoon backport+config window done
- 14:49 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Set wgEnableWatchlistLabels for beta (T411836) (duration: 07m 21s)
- 14:45 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, samtar: Continuing with sync
- 14:44 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, samtar: Backport for Set wgEnableWatchlistLabels for beta (T411836) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:41 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Set wgEnableWatchlistLabels for beta (T411836)
- 14:26 arlolra@deploy2002: Finished scap sync-world: Backport for ExtensionDistributor: mark 1.45 as stable (T408482) (duration: 06m 29s)
- 14:22 arlolra@deploy2002: arlolra, macfan4000: Continuing with sync
- 14:22 arlolra@deploy2002: arlolra, macfan4000: Backport for ExtensionDistributor: mark 1.45 as stable (T408482) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:20 arlolra@deploy2002: Started scap sync-world: Backport for ExtensionDistributor: mark 1.45 as stable (T408482)
- 14:14 sbisson@deploy2002: Finished scap sync-world: Backport for CX3 Build 1.0.0+20251209 (T384485 T408845 T409332 T409337 T409338 T411779) (duration: 09m 01s)
- 14:10 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2228 (T410589)', diff saved to https://phabricator.wikimedia.org/P86501 and previous config saved to /var/cache/conftool/dbconfig/20251210-141046-ladsgroup.json
- 14:10 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2228.codfw.wmnet with reason: Maintenance
- 14:10 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223 (T410589)', diff saved to https://phabricator.wikimedia.org/P86500 and previous config saved to /var/cache/conftool/dbconfig/20251210-141022-ladsgroup.json
- 14:08 sbisson@deploy2002: sbisson: Continuing with sync
- 14:07 sbisson@deploy2002: sbisson: Backport for CX3 Build 1.0.0+20251209 (T384485 T408845 T409332 T409337 T409338 T411779) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:05 sbisson@deploy2002: Started scap sync-world: Backport for CX3 Build 1.0.0+20251209 (T384485 T408845 T409332 T409337 T409338 T411779)
- 13:55 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to https://phabricator.wikimedia.org/P86499 and previous config saved to /var/cache/conftool/dbconfig/20251210-135514-ladsgroup.json
- 13:53 kart_: Updated Recommendation API to 2025-12-09-164214-production (T384485, T409338, T409332)
- 13:51 kartik@deploy2002: helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
- 13:47 kartik@deploy2002: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
- 13:41 kartik@deploy2002: helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
- 13:40 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to https://phabricator.wikimedia.org/P86497 and previous config saved to /var/cache/conftool/dbconfig/20251210-134007-ladsgroup.json
- 13:27 hnowlan@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on A:swift-fe-eqiad
- 13:25 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223 (T410589)', diff saved to https://phabricator.wikimedia.org/P86496 and previous config saved to /var/cache/conftool/dbconfig/20251210-132459-ladsgroup.json
- 13:20 hnowlan@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on A:swift-fe-eqiad
- 12:53 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
- 12:52 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
- 11:50 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-analytics-test: apply
- 11:50 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-analytics-test: apply
- 11:41 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1017.eqiad.wmnet
- 11:35 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1017.eqiad.wmnet
- 10:39 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-build1001.eqiad.wmnet with reason: host reimage
- 10:35 dpogorzelski@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-build1001.eqiad.wmnet with reason: host reimage
- 10:19 dpogorzelski@cumin1003: START - Cookbook sre.hosts.reimage for host ml-build1001.eqiad.wmnet with OS trixie
- 10:14 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from ml-lab1001 to ml-build1001
- 10:13 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ml-build1001
- 10:11 jelto@puppetserver1001: conftool action : set/pooled=no; selector: cluster=tcp-proxy,service=gerrit
- 10:11 dpogorzelski@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ml-build1001
- 10:11 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ml-build1001 on all recursors
- 10:11 dpogorzelski@cumin1003: START - Cookbook sre.dns.wipe-cache ml-build1001 on all recursors
- 10:11 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 10:11 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming ml-lab1001 to ml-build1001 - dpogorzelski@cumin1003"
- 10:10 dpogorzelski@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming ml-lab1001 to ml-build1001 - dpogorzelski@cumin1003"
- 10:04 dpogorzelski@cumin1003: START - Cookbook sre.dns.netbox
- 10:04 dpogorzelski@cumin1003: START - Cookbook sre.hosts.rename from ml-lab1001 to ml-build1001
- 10:01 jelto@puppetserver1001: conftool action : set/pooled=no; selector: cluster=tcp-proxy,service=gerrit,dc=drmrs
- 09:50 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 09:50 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 09:47 jelto@puppetserver1001: conftool action : set/pooled=no; selector: name=tcp-proxy6001.drmrs.wmnet
- 09:15 joal@deploy2002: Finished deploy [analytics/refinery@6e8f9d4] (thin): Regular analytics train THIN [analytics/refinery@6e8f9d4a] (duration: 01m 13s)
- 09:14 joal@deploy2002: Started deploy [analytics/refinery@6e8f9d4] (thin): Regular analytics train THIN [analytics/refinery@6e8f9d4a]
- 09:14 joal@deploy2002: Finished deploy [analytics/refinery@6e8f9d4]: Regular analytics train [analytics/refinery@6e8f9d4a] (duration: 02m 30s)
- 09:11 joal@deploy2002: Started deploy [analytics/refinery@6e8f9d4]: Regular analytics train [analytics/refinery@6e8f9d4a]
- 09:11 joal@deploy2002: Finished deploy [analytics/refinery@6e8f9d4] (hadoop-test): Regular analytics train TEST [analytics/refinery@6e8f9d4a] (duration: 01m 04s)
- 09:10 joal@deploy2002: Started deploy [analytics/refinery@6e8f9d4] (hadoop-test): Regular analytics train TEST [analytics/refinery@6e8f9d4a]
- 05:56 dpogorzelski@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ml-lab1001.eqiad.wmnet with OS trixie
- 05:51 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2223 (T410589)', diff saved to https://phabricator.wikimedia.org/P86492 and previous config saved to /var/cache/conftool/dbconfig/20251210-055138-ladsgroup.json
- 05:51 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2223.codfw.wmnet with reason: Maintenance
- 05:51 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213 (T410589)', diff saved to https://phabricator.wikimedia.org/P86491 and previous config saved to /var/cache/conftool/dbconfig/20251210-055125-ladsgroup.json
- 05:36 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213', diff saved to https://phabricator.wikimedia.org/P86490 and previous config saved to /var/cache/conftool/dbconfig/20251210-053618-ladsgroup.json
- 05:21 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213', diff saved to https://phabricator.wikimedia.org/P86489 and previous config saved to /var/cache/conftool/dbconfig/20251210-052110-ladsgroup.json
- 05:06 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213 (T410589)', diff saved to https://phabricator.wikimedia.org/P86488 and previous config saved to /var/cache/conftool/dbconfig/20251210-050603-ladsgroup.json
- 01:57 cstone: SmashPig upgraded from 1442d0a0 to 5c731f99
- 01:18 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 17m 50s)
- 01:01 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
2025-12-09
- 23:28 jhancock@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host logging-sd2005.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 23:28 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host logging-sd2005.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 23:27 jhancock@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host logging-sd2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 23:27 jhancock@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host logging-sd2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 23:26 jhancock@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host logging-sd2005.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 23:26 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host logging-sd2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 23:25 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host logging-sd2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 23:25 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host logging-sd2005.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 23:24 jhancock@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host logging-sd2007
- 23:24 jhancock@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host logging-sd2007
- 23:24 jhancock@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host logging-sd2006
- 23:24 jhancock@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host logging-sd2006
- 23:24 jhancock@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host logging-sd2006
- 23:24 jhancock@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host logging-sd2006
- 23:24 jhancock@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host logging-sd2005
- 23:24 jhancock@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host logging-sd2007
- 23:23 jhancock@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host logging-sd2006
- 23:23 jhancock@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host logging-sd2007
- 23:23 jhancock@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host logging-sd2006
- 23:23 jhancock@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host logging-sd2005
- 23:23 jhancock@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 23:23 jhancock@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding logging-sd2005-7 to codfw - jhancock@cumin1003"
- 23:23 jhancock@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding logging-sd2005-7 to codfw - jhancock@cumin1003"
- 23:19 jhancock@cumin1003: START - Cookbook sre.dns.netbox
- 22:28 jdrewniak@deploy2002: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 02m 02s)
- 22:26 jdrewniak@deploy2002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 05m 59s)
- 22:07 jhathaway@dns1004: END - running authdns-update
- 22:06 jhathaway@dns1004: START - running authdns-update
- 22:01 bking@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on wdqs[1028-1032].eqiad.wmnet with reason: T410406
- 21:32 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2213 (T410589)', diff saved to https://phabricator.wikimedia.org/P86487 and previous config saved to /var/cache/conftool/dbconfig/20251209-213205-ladsgroup.json
- 21:31 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2213.codfw.wmnet with reason: Maintenance
- 21:31 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 (T410589)', diff saved to https://phabricator.wikimedia.org/P86486 and previous config saved to /var/cache/conftool/dbconfig/20251209-213152-ladsgroup.json
- 21:26 catrope@deploy2002: Finished scap sync-world: Backport for [ukwiki] Limit thanks for newbies to 3 per hour (T411588), [enwikibooks] Allow sysops to revert abusefilter and grant/revoke some flags (T411828) (duration: 07m 56s)
- 21:24 taavi: run new CentralAuth:RecalculateGlobalEditCount.php on tokwiki
- 21:22 catrope@deploy2002: superpes, catrope: Continuing with sync
- 21:21 catrope@deploy2002: superpes, catrope: Backport for [ukwiki] Limit thanks for newbies to 3 per hour (T411588), [enwikibooks] Allow sysops to revert abusefilter and grant/revoke some flags (T411828) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:18 catrope@deploy2002: Started scap sync-world: Backport for [ukwiki] Limit thanks for newbies to 3 per hour (T411588), [enwikibooks] Allow sysops to revert abusefilter and grant/revoke some flags (T411828)
- 21:16 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P86485 and previous config saved to /var/cache/conftool/dbconfig/20251209-211644-ladsgroup.json
- 21:13 egardner@deploy2002: Finished scap sync-world: Backport for Backport: Instrument sticky header session length to 1.46.0-wmf.5 (T412146), Fix scroll-on-collapse (T411868 T411869), Fix heading background positioning (T412054) (duration: 08m 26s)
- 21:09 egardner@deploy2002: egardner, ksarabia: Continuing with sync
- 21:08 egardner@deploy2002: egardner, ksarabia: Backport for Backport: Instrument sticky header session length to 1.46.0-wmf.5 (T412146), Fix scroll-on-collapse (T411868 T411869), Fix heading background positioning (T412054) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:05 egardner@deploy2002: Started scap sync-world: Backport for Backport: Instrument sticky header session length to 1.46.0-wmf.5 (T412146), Fix scroll-on-collapse (T411868 T411869), Fix heading background positioning (T412054)
- 21:01 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P86484 and previous config saved to /var/cache/conftool/dbconfig/20251209-210136-ladsgroup.json
- 20:46 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 (T410589)', diff saved to https://phabricator.wikimedia.org/P86483 and previous config saved to /var/cache/conftool/dbconfig/20251209-204628-ladsgroup.json
- 20:11 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest2009.codfw.wmnet with OS trixie
- 19:54 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2009.codfw.wmnet with reason: host reimage
- 19:48 cmooney@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2009.codfw.wmnet with reason: host reimage
- 19:20 cmooney@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2009.codfw.wmnet with OS trixie
- 18:47 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ssw1-e1-codfw.mgmt,ssw1-f1-codfw.mgmt with reason: upgrading sr-linux on Nokia switches codfw
- 18:32 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 17 hosts with reason: upgrading sr-linux on Nokia switches codfw
- 18:25 dzahn@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
- 18:24 dzahn@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
- 18:24 dzahn@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 18:24 dzahn@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
- 18:23 dzahn@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
- 18:22 dzahn@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
- 18:22 dzahn@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
- 18:21 dzahn@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
- 17:19 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 17:19 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 16:39 brett@dns1006: END - running authdns-update
- 16:38 brett@dns1006: START - running authdns-update
- 16:20 brett: Remove varnishkafka from trixie-wikimedia - T401832
- 15:47 cdanis@dns3003: END - running authdns-update
- 15:45 cdanis@dns3003: START - running authdns-update
- 15:22 Lucas_WMDE: UTC afternoon backport+config window done
- 15:19 sbisson@deploy2002: Finished scap sync-world: Backport for Article search: surface nominated collections (JSON files) (T408842) (duration: 69m 26s)
- 15:15 vgutierrez: restarting ATS on cp3074
- 15:06 sbisson@deploy2002: sbisson: Continuing with sync
- 15:05 sbisson@deploy2002: sbisson: Backport for Article search: surface nominated collections (JSON files) (T408842) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:28 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db1229 gradually with 4 steps - Pooling in after cloning
- 14:09 sbisson@deploy2002: Started scap sync-world: Backport for Article search: surface nominated collections (JSON files) (T408842)
- 14:08 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-lab1001.eqiad.wmnet with reason: host reimage
- 14:04 bwojtowicz@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 14:02 dpogorzelski@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-lab1001.eqiad.wmnet with reason: host reimage
- 13:53 gehel: sudo cumin 'A:lvs-low-traffic-eqiad' 'systemctl restart pybal.service' - T406222
- 13:48 gehel: sudo cumin 'A:lvs-secondary-eqiad' 'systemctl restart pybal.service' - T406222
- 13:48 dbrant@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifeeds: apply
- 13:47 dbrant@deploy2002: helmfile [codfw] START helmfile.d/services/wikifeeds: apply
- 13:47 dbrant@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifeeds: apply
- 13:47 dpogorzelski@cumin1003: START - Cookbook sre.hosts.reimage for host ml-lab1001.eqiad.wmnet with OS trixie
- 13:47 dbrant@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifeeds: apply
- 13:45 dbrant@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
- 13:45 dbrant@deploy2002: helmfile [staging] START helmfile.d/services/wikifeeds: apply
- 13:43 fceratto@cumin1003: START - Cookbook sre.mysql.pool db1229 gradually with 4 steps - Pooling in after cloning
- 13:19 dpogorzelski@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ml-lab1001.eqiad.wmnet with OS trixie
- 13:06 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2211 (T410589)', diff saved to https://phabricator.wikimedia.org/P86471 and previous config saved to /var/cache/conftool/dbconfig/20251209-130640-ladsgroup.json
- 13:06 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2211.codfw.wmnet with reason: Maintenance
- 13:04 dpogorzelski@cumin1003: START - Cookbook sre.hosts.reimage for host ml-lab1001.eqiad.wmnet with OS trixie
- 13:03 dpogorzelski@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ml-lab1001.eqiad.wmnet with OS trixie
- 12:30 dpogorzelski@cumin1003: START - Cookbook sre.hosts.reimage for host ml-lab1001.eqiad.wmnet with OS trixie
- 12:09 dpogorzelski@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ml-lab1001.eqiad.wmnet with OS trixie
- 10:58 dpogorzelski@cumin1003: START - Cookbook sre.hosts.reimage for host ml-lab1001.eqiad.wmnet with OS trixie
- 10:57 dpogorzelski@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ml-lab1001.eqiad.wmnet with OS trixie
- 10:38 XioNoX: set port-speed on disabled Nokia interface
- 10:30 dpogorzelski@cumin1003: START - Cookbook sre.hosts.reimage for host ml-lab1001.eqiad.wmnet with OS trixie
- 10:03 dpogorzelski@cumin1003: END (FAIL) - Cookbook sre.hosts.rename (exit_code=93) from ml-lab1001 to ml-build1001
- 10:03 dpogorzelski@cumin1003: START - Cookbook sre.hosts.rename from ml-lab1001 to ml-build1001
- 09:53 elukey@puppetserver1001: conftool action : set/pooled=yes:weight=10; selector: name=ml-serve1013.eqiad.wmnet
- 09:47 elukey@puppetserver1001: conftool action : set/pooled=true:weight=10; selector: name=ml-serve1013.eqiad.wmnet
- 08:46 matthiasmullie: UTC morning backports done
- 08:42 mlitn@deploy2002: Finished scap sync-world: Backport for Squashed diff to master (duration: 07m 34s)
- 08:38 mlitn@deploy2002: mlitn: Continuing with sync
- 08:36 mlitn@deploy2002: mlitn: Backport for Squashed diff to master synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 08:34 mlitn@deploy2002: Started scap sync-world: Backport for Squashed diff to master
- 08:30 wmde-fisch@deploy2002: Finished scap sync-world: Backport for ext.wikimediaEvents: Add xLab impactTest experiment-specific instrument (T407570), VE: Don't create a synth ref when there's a LDR main ref (T411245) (duration: 08m 56s)
- 08:26 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 08:26 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 08:25 wmde-fisch@deploy2002: wmde-fisch, sfaci: Continuing with sync
- 08:23 wmde-fisch@deploy2002: wmde-fisch, sfaci: Backport for ext.wikimediaEvents: Add xLab impactTest experiment-specific instrument (T407570), VE: Don't create a synth ref when there's a LDR main ref (T411245) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 08:21 wmde-fisch@deploy2002: Started scap sync-world: Backport for ext.wikimediaEvents: Add xLab impactTest experiment-specific instrument (T407570), VE: Don't create a synth ref when there's a LDR main ref (T411245)
- 05:48 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2201.codfw.wmnet with reason: Maintenance
- 05:48 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T410589)', diff saved to https://phabricator.wikimedia.org/P86465 and previous config saved to /var/cache/conftool/dbconfig/20251209-054822-ladsgroup.json
- 05:33 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P86464 and previous config saved to /var/cache/conftool/dbconfig/20251209-053314-ladsgroup.json
- 05:18 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P86463 and previous config saved to /var/cache/conftool/dbconfig/20251209-051806-ladsgroup.json
- 05:04 eileen: civicrm upgraded from e0867392 to bdf84821
- 05:03 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T410589)', diff saved to https://phabricator.wikimedia.org/P86462 and previous config saved to /var/cache/conftool/dbconfig/20251209-050258-ladsgroup.json
- 05:02 mwpresync@deploy2002: Pruned MediaWiki: 1.46.0-wmf.3 (duration: 02m 44s)
- 04:06 eileen: civicrm upgraded from f66aaff7 to e0867392
- 02:19 eileen: civicrm upgraded from 86784b37 to f66aaff7
- 01:18 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 17m 40s)
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
- 00:21 eileen: civicrm upgraded from 2dfecb38 to 86784b37
2025-12-08
- 23:33 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/echoserver: apply
- 23:33 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/echoserver: apply
- 22:24 ryankemper: `ryankemper@wdqs1015:~$ sudo systemctl restart wdqs-blazegraph`
- 22:23 eileen: civicrm upgraded from 64272a7d to 2dfecb38
- 22:23 urbanecm@deploy2002: Finished scap sync-world: Backport for Add i18n for edit full page button (duration: 46m 55s)
- 22:17 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/echoserver: apply
- 22:16 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/echoserver: apply
- 22:10 urbanecm@deploy2002: kemayo, urbanecm: Continuing with sync
- 22:10 urbanecm@deploy2002: kemayo, urbanecm: Backport for Add i18n for edit full page button synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:49 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 21:49 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 21:36 urbanecm@deploy2002: Started scap sync-world: Backport for Add i18n for edit full page button
- 21:35 urbanecm@deploy2002: mwscript-k8s job started: ORES:PopulateDatabase --wiki=thwiki # T409438
- 21:34 urbanecm@deploy2002: Finished scap sync-world: Backport for Enable revertrisk filters in thwiki (T409438) (duration: 09m 11s)
- 21:30 urbanecm@deploy2002: urbanecm, kgraessle: Continuing with sync
- 21:27 urbanecm@deploy2002: urbanecm, kgraessle: Backport for Enable revertrisk filters in thwiki (T409438) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:25 urbanecm@deploy2002: Started scap sync-world: Backport for Enable revertrisk filters in thwiki (T409438)
- 21:24 urbanecm@deploy2002: Finished scap sync-world: Backport for Add instrumentation for mobile section switching (T410319), Edit full page: Tweak skeleton appearance and fix scroll offsets, Set full page scroll to 130px (T411669), Ensure images are fixed size on mobile while loading (T411669) (duration: 07m 36s)
- 21:20 urbanecm@deploy2002: kemayo, urbanecm: Continuing with sync
- 21:20 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2178 (T410589)', diff saved to https://phabricator.wikimedia.org/P86460 and previous config saved to /var/cache/conftool/dbconfig/20251208-212004-ladsgroup.json
- 21:19 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2178.codfw.wmnet with reason: Maintenance
- 21:19 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 (T410589)', diff saved to https://phabricator.wikimedia.org/P86459 and previous config saved to /var/cache/conftool/dbconfig/20251208-211940-ladsgroup.json
- 21:18 urbanecm@deploy2002: kemayo, urbanecm: Backport for Add instrumentation for mobile section switching (T410319), Edit full page: Tweak skeleton appearance and fix scroll offsets, Set full page scroll to 130px (T411669), Ensure images are fixed size on mobile while loading (T411669) synced to the testservers (see https://wikitech.wikimedia.org/w
- 21:16 urbanecm@deploy2002: Started scap sync-world: Backport for Add instrumentation for mobile section switching (T410319), Edit full page: Tweak skeleton appearance and fix scroll offsets, Set full page scroll to 130px (T411669), Ensure images are fixed size on mobile while loading (T411669)
- 21:13 urbanecm@deploy2002: Finished scap sync-world: Backport for Partially undeploy 2025 Global Readers Survey (T410918), DiscussionTools: turn on automatic topic subscriptions for all editors (T290778) (duration: 08m 40s)
- 21:09 urbanecm@deploy2002: dani, urbanecm, kemayo: Continuing with sync
- 21:07 urbanecm@deploy2002: dani, urbanecm, kemayo: Backport for Partially undeploy 2025 Global Readers Survey (T410918), DiscussionTools: turn on automatic topic subscriptions for all editors (T290778) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:05 urbanecm@deploy2002: Started scap sync-world: Backport for Partially undeploy 2025 Global Readers Survey (T410918), DiscussionTools: turn on automatic topic subscriptions for all editors (T290778)
- 21:04 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P86458 and previous config saved to /var/cache/conftool/dbconfig/20251208-210432-ladsgroup.json
- 21:01 eileen: civicrm upgraded from dd7909ba to 64272a7d
- 20:49 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P86457 and previous config saved to /var/cache/conftool/dbconfig/20251208-204924-ladsgroup.json
- 20:34 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 (T410589)', diff saved to https://phabricator.wikimedia.org/P86456 and previous config saved to /var/cache/conftool/dbconfig/20251208-203417-ladsgroup.json
- 20:19 urbanecm@deploy2002: mwscript-k8s job started: GrowthExperiments:revalidateLinkRecommendations.php --wiki=itwiki --exceptDatasetChecksums=valid_itwiki_checksums.txt --deleteNullRecommendations --verbose # T412040
- 20:18 urbanecm@deploy2002: mwscript-k8s job started: GrowthExperiments:revalidateLinkRecommendations.php --wiki=itwiki --exceptDatasetChecksums=valid_itwiki_checksums.txt --deleteNullRecommendations # T412040
- 19:46 eileen: civicrm upgraded from 9ba062e3 to dd7909ba
- 19:30 jgleeson: payments-wiki upgraded from cb838b97 to 4767f4d5
- 17:42 urbanecm@deploy2002: Finished scap sync-world: Backport for Move mustache templates from includes (T409057), Adjust styling of confirmation emails (T411526) (duration: 10m 28s)
- 17:36 urbanecm@deploy2002: urbanecm: Continuing with sync
- 17:33 urbanecm@deploy2002: urbanecm: Backport for Move mustache templates from includes (T409057), Adjust styling of confirmation emails (T411526) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 17:31 urbanecm@deploy2002: Started scap sync-world: Backport for Move mustache templates from includes (T409057), Adjust styling of confirmation emails (T411526)
- 16:48 jgleeson: payments-wiki upgraded from 5c381b45 to cb838b97
- 15:39 urbanecm@deploy2002: mwscript-k8s job started: namespaceDupes.php --wiki=arwiktionary # T411819
- 15:38 urbanecm@deploy2002: mwscript-k8s job started: namespaceDupes.php --wiki=arwiktionary --fix # T411819
- 15:36 urbanecm@deploy2002: mwscript-k8s job started: namespaceDupes.php --wiki=arwiktionary # T411819
- 15:35 urbanecm@deploy2002: Finished scap sync-world: Backport for [config] arwiktionary: add 2 namespaces with talks (T411819) (duration: 10m 42s)
- 15:30 urbanecm@deploy2002: hubaishan, urbanecm: Continuing with sync
- 15:28 urbanecm@deploy2002: mwscript-k8s job started: namespaceDupes.php --wiki=niawiktionary # T411850
- 15:27 urbanecm@deploy2002: mwscript-k8s job started: namespaceDupes.php --wiki=shnwiki --fix # T411965
- 15:26 urbanecm@deploy2002: hubaishan, urbanecm: Backport for [config] arwiktionary: add 2 namespaces with talks (T411819) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 15:25 urbanecm@deploy2002: mwscript-k8s job started: namespaceDupes.php --wiki=shnwiki
- 15:24 urbanecm@deploy2002: Started scap sync-world: Backport for [config] arwiktionary: add 2 namespaces with talks (T411819)
- 15:22 urbanecm@deploy2002: Finished scap sync-world: Backport for SVG: do not allow native SVG rendering (T406023), enwikibooks: Limit FlaggedRevs to specific namespaces; disable FR stable-transclusion-checking (T408110 T410330) (duration: 08m 49s)
- 15:18 urbanecm@deploy2002: urbanecm, asmartkitten, hartman: Continuing with sync
- 15:15 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-main: apply
- 15:15 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-main: apply
- 15:15 urbanecm@deploy2002: urbanecm, asmartkitten, hartman: Backport for SVG: do not allow native SVG rendering (T406023), enwikibooks: Limit FlaggedRevs to specific namespaces; disable FR stable-transclusion-checking (T408110 T410330) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 15:13 urbanecm@deploy2002: Started scap sync-world: Backport for SVG: do not allow native SVG rendering (T406023), enwikibooks: Limit FlaggedRevs to specific namespaces; disable FR stable-transclusion-checking (T408110 T410330)
- 15:12 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-main: apply
- 15:12 urbanecm@deploy2002: Finished scap sync-world: Backport for niawiktionary: update wordmark, sitename and projectnamespace (T411850), shnwiki: add draft namespace (T411965), [Growth]:Remove GELevelingUpNewNotificationsEnabled config (T407431) (duration: 08m 49s)
- 15:12 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-main: apply
- 15:08 urbanecm@deploy2002: cyndywikime, urbanecm, anzx: Continuing with sync
- 15:06 urbanecm@deploy2002: cyndywikime, urbanecm, anzx: Backport for niawiktionary: update wordmark, sitename and projectnamespace (T411850), shnwiki: add draft namespace (T411965), [Growth]:Remove GELevelingUpNewNotificationsEnabled config (T407431) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 15:04 urbanecm@deploy2002: Started scap sync-world: Backport for niawiktionary: update wordmark, sitename and projectnamespace (T411850), shnwiki: add draft namespace (T411965), [Growth]:Remove GELevelingUpNewNotificationsEnabled config (T407431)
- 14:53 derick@deploy2002: Finished scap sync-world: Backport for Add Serbian Latin draft namespace and talk namespace aliases (T411750) (duration: 08m 08s)
- 14:49 derick@deploy2002: derick, zoranzoki21: Continuing with sync
- 14:47 derick@deploy2002: derick, zoranzoki21: Backport for Add Serbian Latin draft namespace and talk namespace aliases (T411750) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:45 derick@deploy2002: Started scap sync-world: Backport for Add Serbian Latin draft namespace and talk namespace aliases (T411750)
- 14:34 derick@deploy2002: Finished scap sync-world: Backport for Pass an explicit performer when attempting CreateLocalAccount (T411826 T411952) (duration: 07m 04s)
- 14:30 derick@deploy2002: d3r1ck01, derick: Continuing with sync
- 14:29 derick@deploy2002: d3r1ck01, derick: Backport for Pass an explicit performer when attempting CreateLocalAccount (T411826 T411952) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:27 derick@deploy2002: Started scap sync-world: Backport for Pass an explicit performer when attempting CreateLocalAccount (T411826 T411952)
- 14:25 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-main: apply
- 14:25 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-main: apply
- 12:53 bwojtowicz@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main'.
- 11:39 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2171 (T410589)', diff saved to https://phabricator.wikimedia.org/P86451 and previous config saved to /var/cache/conftool/dbconfig/20251208-113911-ladsgroup.json
- 11:39 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2171.codfw.wmnet with reason: Maintenance
- 11:38 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T410589)', diff saved to https://phabricator.wikimedia.org/P86450 and previous config saved to /var/cache/conftool/dbconfig/20251208-113848-ladsgroup.json
- 11:23 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P86449 and previous config saved to /var/cache/conftool/dbconfig/20251208-112340-ladsgroup.json
- 11:08 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P86448 and previous config saved to /var/cache/conftool/dbconfig/20251208-110832-ladsgroup.json
- 10:53 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T410589)', diff saved to https://phabricator.wikimedia.org/P86447 and previous config saved to /var/cache/conftool/dbconfig/20251208-105325-ladsgroup.json
- 09:48 gehel: restarting Blazegraph on wdqs1015 - allocator decreasing - https://grafana.wikimedia.org/goto/Jygg2zMvg?orgId=1
- 09:14 urbanecm@deploy2002: Finished scap sync-world: Backport for [Growth] Enable Add Link backend on a handful of wikis (T410469), [Growth] Sort the list of Add Link wikis alphabetically (T410469) (duration: 10m 01s)
- 09:08 urbanecm@deploy2002: urbanecm: Continuing with sync
- 09:08 hashar@deploy2002: Finished deploy [integration/docroot@41d63f3]: build: Updating eslint-config-wikimedia to 0.32.3 (duration: 00m 11s)
- 09:07 hashar@deploy2002: Started deploy [integration/docroot@41d63f3]: build: Updating eslint-config-wikimedia to 0.32.3
- 09:06 urbanecm@deploy2002: urbanecm: Backport for [Growth] Enable Add Link backend on a handful of wikis (T410469), [Growth] Sort the list of Add Link wikis alphabetically (T410469) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 09:04 urbanecm@deploy2002: Started scap sync-world: Backport for [Growth] Enable Add Link backend on a handful of wikis (T410469), [Growth] Sort the list of Add Link wikis alphabetically (T410469)
- 08:47 kharlan@deploy2002: Finished scap sync-world: Backport for hCaptcha: Switch enwiki to 99.9% passive mode (T405586) (duration: 39m 54s)
- 08:33 kharlan@deploy2002: kharlan: Continuing with sync
- 08:30 kharlan@deploy2002: kharlan: Backport for hCaptcha: Switch enwiki to 99.9% passive mode (T405586) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 08:07 kharlan@deploy2002: Started scap sync-world: Backport for hCaptcha: Switch enwiki to 99.9% passive mode (T405586)
- 02:07 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2157 (T410589)', diff saved to https://phabricator.wikimedia.org/P86444 and previous config saved to /var/cache/conftool/dbconfig/20251208-020757-ladsgroup.json
- 02:07 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2157.codfw.wmnet with reason: Maintenance
- 01:14 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 13m 07s)
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
2025-12-07
- 21:49 eileen: civicrm upgraded from 9cc43ebd to 9ba062e3
- 18:36 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
- 17:20 elukey@deploy2002: helmfile [codfw] DONE helmfile.d/services/thumbor: sync
- 17:20 elukey@deploy2002: helmfile [codfw] START helmfile.d/services/thumbor: sync
- 11:51 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/thumbor: sync
- 11:51 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/thumbor: sync
- 02:51 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1240.eqiad.wmnet with reason: Maintenance
- 02:51 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T410589)', diff saved to https://phabricator.wikimedia.org/P86442 and previous config saved to /var/cache/conftool/dbconfig/20251207-025120-ladsgroup.json
- 02:36 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P86441 and previous config saved to /var/cache/conftool/dbconfig/20251207-023613-ladsgroup.json
- 02:21 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P86440 and previous config saved to /var/cache/conftool/dbconfig/20251207-022105-ladsgroup.json
- 02:06 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T410589)', diff saved to https://phabricator.wikimedia.org/P86439 and previous config saved to /var/cache/conftool/dbconfig/20251207-020558-ladsgroup.json
- 01:18 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 17m 48s)
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
2025-12-06
- 14:47 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1212 (T410589)', diff saved to https://phabricator.wikimedia.org/P86436 and previous config saved to /var/cache/conftool/dbconfig/20251206-144719-ladsgroup.json
- 14:47 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1212.eqiad.wmnet with reason: Maintenance
- 03:47 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1198.eqiad.wmnet with reason: Maintenance
- 03:47 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T410589)', diff saved to https://phabricator.wikimedia.org/P86435 and previous config saved to /var/cache/conftool/dbconfig/20251206-034700-ladsgroup.json
- 03:31 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P86434 and previous config saved to /var/cache/conftool/dbconfig/20251206-033152-ladsgroup.json
- 03:16 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P86433 and previous config saved to /var/cache/conftool/dbconfig/20251206-031644-ladsgroup.json
- 03:01 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T410589)', diff saved to https://phabricator.wikimedia.org/P86432 and previous config saved to /var/cache/conftool/dbconfig/20251206-030136-ladsgroup.json
- 01:18 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 17m 22s)
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
2025-12-05
- 22:35 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
- 22:34 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
- 22:34 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
- 22:33 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
- 22:32 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
- 22:31 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
- 22:29 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
- 22:29 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
- 22:11 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
- 22:11 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
- 22:11 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
- 22:10 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
- 21:57 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
- 21:56 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
- 21:49 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
- 21:38 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
- 21:19 xcollazo@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
- 21:18 xcollazo@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
- 21:03 tchin@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-analytics-external: apply
- 21:03 tchin@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-analytics-external: apply
- 20:17 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
- 20:16 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
- 20:06 ejegg: donorwiki upgraded from 9ab44e85 to bbd96c00
- 19:50 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
- 19:49 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
- 19:10 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
- 19:09 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
- 18:28 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
- 18:27 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
- 18:18 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
- 18:17 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
- 18:10 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
- 18:10 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
- 17:23 topranks: add updated ssh firewall filter config to pfw1-eqiad.wikimedia.org T390939
- 17:11 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-lab1001.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 17:10 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-lab1001.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 17:10 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-lab1001.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 17:10 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-lab1001.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 17:07 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.provision (exit_code=97) for host ml-lab1001.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 17:02 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-lab1001.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 17:02 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-lab1001.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 16:52 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-lab1001.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 16:03 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
- 16:03 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
- 15:30 Amir1: creating ores tables on thwiki (T409438)
- 15:07 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1189 (T410589)', diff saved to https://phabricator.wikimedia.org/P86429 and previous config saved to /var/cache/conftool/dbconfig/20251205-150737-ladsgroup.json
- 15:07 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1189.eqiad.wmnet with reason: Maintenance
- 15:07 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T410589)', diff saved to https://phabricator.wikimedia.org/P86428 and previous config saved to /var/cache/conftool/dbconfig/20251205-150713-ladsgroup.json
- 14:56 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
- 14:55 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
- 14:52 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 14:52 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P86427 and previous config saved to /var/cache/conftool/dbconfig/20251205-145206-ladsgroup.json
- 14:51 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 14:51 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
- 14:50 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
- 14:46 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
- 14:45 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
- 14:36 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P86426 and previous config saved to /var/cache/conftool/dbconfig/20251205-143658-ladsgroup.json
- 14:27 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 14:26 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 14:25 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/ferretdb-growthbook: apply
- 14:25 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/ferretdb-growthbook: apply
- 14:23 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-growthbook: apply
- 14:22 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-growthbook: apply
- 14:21 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T410589)', diff saved to https://phabricator.wikimedia.org/P86425 and previous config saved to /var/cache/conftool/dbconfig/20251205-142150-ladsgroup.json
- 14:12 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 14:11 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 14:08 jayme: stopped puppet on wikikube-ctrl2* and restarted kube-apiserver to temporarily extend audit logging
- 13:52 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
- 13:52 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
- 13:51 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
- 13:50 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
- 13:49 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/ferretdb-growthbook-next: apply
- 13:49 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/ferretdb-growthbook-next: apply
- 13:47 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-growthbook-next: apply
- 13:46 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-growthbook-next: apply
- 13:46 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-growthbook-next: apply
- 13:46 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-growthbook-next: apply
- 13:43 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/ferretdb-growthbook: apply
- 13:43 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/ferretdb-growthbook: apply
- 13:41 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-growthbook: apply
- 13:33 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
- 13:30 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
- 13:10 moritzm: upload python3-sshpubkeys to 3.3.1-1~wmf12u1 to apt.wikimedia.org T411816
- 12:42 moritzm: upgrade python3-sshpubkeys on idm-test1001 to 3.3.1-1~wmf12u1 T411816
- 12:30 jayme: removed helm release mw-script/utk6lsuw in k8s@codfw which was stuck in pending-install state for 9+ days
- 11:53 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 11:52 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 11:42 lucaswerkmeister-wmde@deploy2002: kubectl delete job wikidata-resubmit-changes-for-dispatch-29415459 # T411862
- 11:42 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db1233.eqiad.wmnet onto db1229.eqiad.wmnet
- 11:26 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db1233 gradually with 4 steps - Pool db1233.eqiad.wmnet in after cloning
- 10:41 fceratto@cumin1003: START - Cookbook sre.mysql.pool db1233 gradually with 4 steps - Pool db1233.eqiad.wmnet in after cloning
- 10:02 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 10:02 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 09:59 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 09:58 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 09:16 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) db1233 - Depool db1233.eqiad.wmnet to then clone it to db1229.eqiad.wmnet - fceratto@cumin1003
- 09:16 fceratto@cumin1003: START - Cookbook sre.mysql.depool db1233 - Depool db1233.eqiad.wmnet to then clone it to db1229.eqiad.wmnet - fceratto@cumin1003
- 09:16 fceratto@cumin1003: START - Cookbook sre.mysql.clone of db1233.eqiad.wmnet onto db1229.eqiad.wmnet
- 08:04 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
- 08:03 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
- 08:02 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/ferretdb-growthbook-next: apply
- 08:02 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/ferretdb-growthbook-next: apply
- 08:02 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-growthbook-next: apply
- 08:02 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-growthbook-next: apply
- 07:28 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-growthbook-next: apply
- 07:27 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-growthbook-next: apply
- 03:24 larssandergreen: Updating civicrm from 7a979750 to 9cc43ebd
- 03:08 larssandergreen: Updating civicrm from 36b09796 to 7a979750
- 02:57 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1175 (T410589)', diff saved to https://phabricator.wikimedia.org/P86417 and previous config saved to /var/cache/conftool/dbconfig/20251205-025711-ladsgroup.json
- 02:57 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1175.eqiad.wmnet with reason: Maintenance
- 02:56 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T410589)', diff saved to https://phabricator.wikimedia.org/P86416 and previous config saved to /var/cache/conftool/dbconfig/20251205-025647-ladsgroup.json
- 02:41 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P86415 and previous config saved to /var/cache/conftool/dbconfig/20251205-024139-ladsgroup.json
- 02:40 ejegg: payments-wiki upgraded from 9ab44e85 to 5c381b45
- 02:26 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P86414 and previous config saved to /var/cache/conftool/dbconfig/20251205-022631-ladsgroup.json
- 02:11 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T410589)', diff saved to https://phabricator.wikimedia.org/P86413 and previous config saved to /var/cache/conftool/dbconfig/20251205-021123-ladsgroup.json
- 02:09 wfan: donorwiki upgraded from 053b3f88 to 9ab44e85
- 02:07 wfan: payments-wiki upgraded from d2799b95 to 9ab44e85
- 02:01 rzl: rzl@apt1002:~$ sudo -i reprepro -C component/envoy-future include bullseye-wikimedia /home/rzl/envoyproxy_1.35.7-1_amd64.changes
- 01:44 wfan: SmashPig upgraded from a25fbb28 to 1442d0a0
- 01:41 eileen: civicrm upgraded from d4bd9b1b to 36b09796
- 01:13 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 13m 06s)
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
- 00:27 Amir1: ladsgroup@deploy2002:~$ mwscript-k8s --follow -- findBadBlobs.php --wiki huwikiquote --mark "Corrupted UTF-8 (T351953)" --revisions 3804,3808,3811,3813,3814,3818,3825
- 00:26 Amir1: ladsgroup@deploy2002:~$ mwscript-k8s --follow -- findBadBlobs.php --wiki guwiktionary --mark "Corrupted UTF-8 (T351953)" --revisions 20576
2025-12-04
- 23:47 tzatziki: removing 4 files for legal compliance
- 23:34 tzatziki: removing 2 files for legal compliance
- 23:23 tzatziki: removing 3 files for legal compliance
- 23:16 ryankemper@cumin2002: END (PASS) - Cookbook sre.hadoop.reboot-workers (exit_code=0) for Hadoop test cluster
- 23:16 tzatziki: removing 5 files for legal compliance
- 23:04 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy7002.magru.wmnet
- 23:02 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy7001.magru.wmnet
- 23:00 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host tcp-proxy7002.magru.wmnet
- 23:00 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy6002.drmrs.wmnet
- 22:59 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy6001.drmrs.wmnet
- 22:59 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy5002.eqsin.wmnet
- 22:58 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host tcp-proxy7001.magru.wmnet
- 22:56 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy5001.eqsin.wmnet
- 22:56 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host tcp-proxy6002.drmrs.wmnet
- 22:55 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host tcp-proxy6001.drmrs.wmnet
- 22:55 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host tcp-proxy5002.eqsin.wmnet
- 22:55 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy3002.esams.wmnet
- 22:55 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy4002.ulsfo.wmnet
- 22:52 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy3001.esams.wmnet
- 22:52 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host tcp-proxy5001.eqsin.wmnet
- 22:51 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy2002.codfw.wmnet
- 22:51 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host tcp-proxy4002.ulsfo.wmnet
- 22:51 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host tcp-proxy3002.esams.wmnet
- 22:51 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy4001.ulsfo.wmnet
- 22:51 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy2001.codfw.wmnet
- 22:50 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host tcp-proxy4001.ulsfo.wmnet
- 22:50 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy1002.eqiad.wmnet
- 22:49 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host tcp-proxy3001.esams.wmnet
- 22:48 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host tcp-proxy2002.codfw.wmnet
- 22:47 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host tcp-proxy2001.codfw.wmnet
- 22:46 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host tcp-proxy1002.eqiad.wmnet
- 22:42 dzahn@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-cluster (exit_code=99)
- 22:42 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-cluster
- 22:37 sbassett: Deployed security fix for T409226
- 22:35 ryankemper@cumin2002: START - Cookbook sre.hadoop.reboot-workers for Hadoop test cluster
- 22:28 sbassett: Deployed security fix for T408135
- 22:22 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 14 hosts with reason: T408532
- 22:20 ryankemper: T411568 Rebooting `stat*`
- 22:11 ryankemper@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on stat[1008-1011].eqiad.wmnet with reason: T411568
- 22:06 cscott@deploy2002: Finished scap sync-world: Backport for Activate postprocessing cache on testwiki, test2wiki, officewiki (T348255) (duration: 14m 23s)
- 22:02 cscott@deploy2002: ihurbain, cscott: Continuing with sync
- 21:54 cscott@deploy2002: ihurbain, cscott: Backport for Activate postprocessing cache on testwiki, test2wiki, officewiki (T348255) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:52 cscott@deploy2002: Started scap sync-world: Backport for Activate postprocessing cache on testwiki, test2wiki, officewiki (T348255)
- 21:45 jforrester@deploy2002: Finished scap sync-world: Backport for Followup Ie40b9e59a4: Fortify unified metrics method (T411793) (duration: 07m 16s)
- 21:40 jforrester@deploy2002: jforrester: Continuing with sync
- 21:40 jforrester@deploy2002: jforrester: Backport for Followup Ie40b9e59a4: Fortify unified metrics method (T411793) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:37 jforrester@deploy2002: Started scap sync-world: Backport for Followup Ie40b9e59a4: Fortify unified metrics method (T411793)
- 21:24 jforrester@deploy2002: Finished scap sync-world: Backport for [tokwiki] Allow sysops to grant/remove confirmed status (T411683), OATHAuth: Remove wmgOATHAuthDisableRight (T399664), Remove /data-parsoid/ endpoint from specs per T393557 (T411517), Shorten 'close' cookie wait period for enwiki banners (T411800) (duration: 10m 04s)
- 21:19 jforrester@deploy2002: mstyles, aaron, superpes, jforrester, ejegg: Continuing with sync
- 21:18 jforrester@deploy2002: mstyles, aaron, superpes, jforrester, ejegg: Backport for [tokwiki] Allow sysops to grant/remove confirmed status (T411683), OATHAuth: Remove wmgOATHAuthDisableRight (T399664), Remove /data-parsoid/ endpoint from specs per T393557 (T411517), Shorten 'close' cookie wait period for enwiki banners (T411800) synced to the t
- 21:14 jforrester@deploy2002: Started scap sync-world: Backport for [tokwiki] Allow sysops to grant/remove confirmed status (T411683), OATHAuth: Remove wmgOATHAuthDisableRight (T399664), Remove /data-parsoid/ endpoint from specs per T393557 (T411517), Shorten 'close' cookie wait period for enwiki banners (T411800)
- 21:11 kharlan@deploy2002: Finished scap sync-world: Backport for Use a separate right for Special:SuggestedInvestigations (T411557) (duration: 57m 45s)
- 21:03 brett: import varnishkafka 1.2.0~deb13+wmf1 into trixie-wikimedia - T401832
- 21:01 taavi@deploy2002: mwscript-k8s job started: initEditCount --wiki=tokwiki
- 20:58 kharlan@deploy2002: kharlan: Continuing with sync
- 20:57 kharlan@deploy2002: kharlan: Backport for Use a separate right for Special:SuggestedInvestigations (T411557) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:50 brett: import libvmod-wmfuniq 0.2.0~deb13+wmf1 into trixie-wikimedia - T401832
- 20:28 brett: Delete libvmod-netmapper 1.10-1~deb13+wmf1, import libvmod-netmapper 1.10~deb13+wmf1 into trixie-wikimedia - T401832
- 20:13 kharlan@deploy2002: Started scap sync-world: Backport for Use a separate right for Special:SuggestedInvestigations (T411557)
- 20:13 brett: import libvmod-querysort 0.4~deb13+wmf1 into trixie-wikimedia - T401832
- 20:05 cstone: payments-wiki upgraded from 714ed4cf to d2799b95
- 20:00 brett: import libvmod-netmapper 1.10-1~deb13+wmf1 into trixie-wikimedia - T401832
- 19:30 kharlan@deploy2002: Finished scap sync-world: Backport for hCaptcha: Persist the captcha consequence in the user session (T410657) (duration: 11m 16s)
- 19:24 kharlan@deploy2002: kharlan: Continuing with sync
- 19:21 kharlan@deploy2002: kharlan: Backport for hCaptcha: Persist the captcha consequence in the user session (T410657) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 19:19 kharlan@deploy2002: Started scap sync-world: Backport for hCaptcha: Persist the captcha consequence in the user session (T410657)
- 19:13 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: apply
- 19:12 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: apply
- 18:50 cjming@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic: apply
- 18:50 cjming@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic: apply
- 18:46 cjming@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic-next: apply
- 18:45 cjming@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply
- 18:22 ejegg: fundraising civicrm rolled back from 510ab862 to d4bd9b1b
- 18:21 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1019.eqiad.wmnet
- 18:21 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs1019.eqiad.wmnet
- 18:09 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 18:09 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 18:05 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs1019.eqiad.wmnet with OS bullseye
- 17:48 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1019.eqiad.wmnet with reason: host reimage
- 17:45 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs1019.eqiad.wmnet with reason: host reimage
- 17:44 ejegg: fundraising civicrm upgraded from d4bd9b1b to 510ab862
- 17:30 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs1019.eqiad.wmnet with OS bullseye
- 17:21 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host franio1004
- 17:21 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host franio1004
- 17:20 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 17:17 vriley@cumin1003: START - Cookbook sre.dns.netbox
- 17:06 topranks: disable BGP to lvs1019 on eqiad core routers ahead of switch migration T405628
- 17:06 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1019.eqiad.wmnet with reason: move primary uplink from asw2-c7-eqiad to lsw1-c7-eqiad and remove link to asw2-d2-eqiad - T405628
- 15:55 hashar@deploy2002: Finished deploy [gerrit/gerrit@121bd1c]: Remove duplicate [DISMISS] button (duration: 00m 11s)
- 15:55 hashar@deploy2002: Started deploy [gerrit/gerrit@121bd1c]: Remove duplicate [DISMISS] button
- 15:51 dpogorzelski@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on ml-lab1001.eqiad.wmnet with reason: decommission
- 15:50 dpogorzelski@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on ml-lab1001.eqiad.wmnet with reason: decommission
- 15:50 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host conf2005.codfw.wmnet
- 15:45 bking@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker2003.codfw.wmnet
- 15:45 bking@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker2003.codfw.wmnet
- 15:44 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host conf2005.codfw.wmnet
- 15:43 hashar@deploy2002: Finished deploy [gerrit/gerrit@774e2ff]: Ease configuration of the motd banner && Add banner for the 2025 developer survey (duration: 00m 15s)
- 15:43 hashar@deploy2002: Started deploy [gerrit/gerrit@774e2ff]: Ease configuration of the motd banner && Add banner for the 2025 developer survey
- 15:41 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host conf2004.codfw.wmnet
- 15:38 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 15:38 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 15:36 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host conf2004.codfw.wmnet
- 15:35 bking@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker2003.codfw.wmnet
- 15:33 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host conf1009.eqiad.wmnet
- 15:30 bking@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker2003.codfw.wmnet
- 15:28 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host conf1009.eqiad.wmnet
- 15:26 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host conf1008.eqiad.wmnet
- 15:20 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host conf1008.eqiad.wmnet
- 15:15 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host conf1007.eqiad.wmnet
- 15:09 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host conf1007.eqiad.wmnet
- 15:08 cgoubert@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
- 15:06 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 15:06 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 15:06 cgoubert@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
- 15:05 cgoubert@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
- 15:03 Lucas_WMDE: UTC afternoon backport+config window done
- 15:03 cgoubert@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
- 15:03 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 15:02 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
- 15:02 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
- 15:02 ladsgroup@deploy2002: Finished scap sync-world: Backport for RevisionStore: Catch ParameterAssertionException too (T351953) (duration: 09m 26s)
- 15:01 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
- 14:59 cgoubert@deploy2002: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 14:59 cgoubert@deploy2002: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
- 14:59 cgoubert@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 14:58 cgoubert@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
- 14:55 ladsgroup@deploy2002: jforrester, ladsgroup: Continuing with sync
- 14:54 ladsgroup@deploy2002: jforrester, ladsgroup: Backport for RevisionStore: Catch ParameterAssertionException too (T351953) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:52 ladsgroup@deploy2002: Started scap sync-world: Backport for RevisionStore: Catch ParameterAssertionException too (T351953)
- 14:50 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
- 14:49 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-cron: apply
- 14:37 derick@deploy2002: Finished scap sync-world: Backport for Revert "User: Log where the data was loaded when CAS update failed" (T410652), Revert "User: Log where the data was loaded when CAS update failed" (T410652), Fetch user object from primary DB (for writes) not replica DB (T410652) (duration: 13m 24s)
- 14:27 derick@deploy2002: d3r1ck01, derick: Continuing with sync
- 14:26 derick@deploy2002: d3r1ck01, derick: Backport for Revert "User: Log where the data was loaded when CAS update failed" (T410652), Revert "User: Log where the data was loaded when CAS update failed" (T410652), Fetch user object from primary DB (for writes) not replica DB (T410652) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes
- 14:23 derick@deploy2002: Started scap sync-world: Backport for Revert "User: Log where the data was loaded when CAS update failed" (T410652), Revert "User: Log where the data was loaded when CAS update failed" (T410652), Fetch user object from primary DB (for writes) not replica DB (T410652)
- 14:17 gehel@cumin2002: conftool action : set/weight=10; selector: service=druid-public-coordinator
- 14:17 gehel@cumin2002: conftool action : set/pooled=yes; selector: service=druid-public-coordinator
- 14:14 tchanders@deploy2002: Finished scap sync-world: Backport for Enable temporary accounts on enwikinews and ptwikibooks (T411618) (duration: 10m 36s)
- 14:11 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1166 (T410589)', diff saved to https://phabricator.wikimedia.org/P86406 and previous config saved to /var/cache/conftool/dbconfig/20251204-141124-ladsgroup.json
- 14:11 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
- 14:11 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T410589)', diff saved to https://phabricator.wikimedia.org/P86405 and previous config saved to /var/cache/conftool/dbconfig/20251204-141101-ladsgroup.json
- 14:08 tchanders@deploy2002: tchanders: Continuing with sync
- 14:06 tchanders@deploy2002: tchanders: Backport for Enable temporary accounts on enwikinews and ptwikibooks (T411618) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:03 tchanders@deploy2002: Started scap sync-world: Backport for Enable temporary accounts on enwikinews and ptwikibooks (T411618)
- 13:55 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P86404 and previous config saved to /var/cache/conftool/dbconfig/20251204-135554-ladsgroup.json
- 13:40 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P86403 and previous config saved to /var/cache/conftool/dbconfig/20251204-134046-ladsgroup.json
- 13:25 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T410589)', diff saved to https://phabricator.wikimedia.org/P86402 and previous config saved to /var/cache/conftool/dbconfig/20251204-132539-ladsgroup.json
- 13:22 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 13:22 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 13:19 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 13:19 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 13:16 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 13:15 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 13:15 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 13:14 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 13:07 moritzm: installing waitress security updates
- 12:45 moritzm: installing postgresql-15 security updates
- 11:31 moritzm: installing net-snmp security updates
- 11:21 moritzm: rebuild software RAIDs on T410743
- 11:00 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
- 10:59 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
- 09:48 moritzm: upgrade Envoy on an-launcher T405808
- 09:43 hashar@deploy2002: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.5 refs T408275
- 09:35 moritzm: cleanup lingering sessions of offboarded user T389324
- 09:30 hashar@deploy2002: Finished scap sync-world: Backport for REST: add explicit cast to sitemapSize calcuation to avoid warning (T411580), Followup I81a2c4de77: Verify stats label values are not empty (T411585) (duration: 09m 59s)
- 09:26 hashar@deploy2002: jforrester, hashar: Continuing with sync
- 09:23 hashar@deploy2002: jforrester, hashar: Backport for REST: add explicit cast to sitemapSize calcuation to avoid warning (T411580), Followup I81a2c4de77: Verify stats label values are not empty (T411585) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 09:22 arnoldokoth: upgrade envoyproxy on lists T405808
- 09:20 hashar@deploy2002: Started scap sync-world: Backport for REST: add explicit cast to sitemapSize calcuation to avoid warning (T411580), Followup I81a2c4de77: Verify stats label values are not empty (T411585)
- 09:20 arnoldokoth: upgrade envoyproxy on vrts T405808
- 09:19 slyngshede@cumin1003: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Arinaigum out of all services on: 2419 hosts
- 03:50 ejegg: fundraising civicrm upgraded from b1fc5afc to d4bd9b1b
- 01:23 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1157 (T410589)', diff saved to https://phabricator.wikimedia.org/P86394 and previous config saved to /var/cache/conftool/dbconfig/20251204-012321-ladsgroup.json
- 01:23 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1157.eqiad.wmnet with reason: Maintenance
- 01:18 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 17m 47s)
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
2025-12-03
- 23:08 Amir1: hard rebooting codesearch9.codesearch.eqiad1.wikimedia.cloud (T411728)
- 22:51 mutante: maintenance on https://codesearch.wmcloud.org/ - trying to fix disk space issue - detaching volume to extend it
- 22:50 mutante: maintenance on https://codesearch.wmcloud.org/ - trying to fix disk space issue
- 22:33 ryankemper@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 22:33 ryankemper@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 22:14 ryankemper@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 22:13 ryankemper@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 22:09 ryankemper@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 22:08 ryankemper@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 21:53 aaron@deploy2002: Finished scap sync-world: Backport for Update Math API title and project-specific /math/ endpoint stability policy (T411517) (duration: 08m 25s)
- 21:49 aaron@deploy2002: aaron: Continuing with sync
- 21:47 aaron@deploy2002: aaron: Backport for Update Math API title and project-specific /math/ endpoint stability policy (T411517) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:45 aaron@deploy2002: Started scap sync-world: Backport for Update Math API title and project-specific /math/ endpoint stability policy (T411517)
- 21:42 derick@deploy2002: Finished scap sync-world: Backport for User: Log where the data was loaded when CAS update failed (T410652), User: Log where the data was loaded when CAS update failed (T410652) (duration: 07m 33s)
- 21:38 derick@deploy2002: derick, d3r1ck01: Continuing with sync
- 21:37 derick@deploy2002: derick, d3r1ck01: Backport for User: Log where the data was loaded when CAS update failed (T410652), User: Log where the data was loaded when CAS update failed (T410652) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:35 derick@deploy2002: Started scap sync-world: Backport for User: Log where the data was loaded when CAS update failed (T410652), User: Log where the data was loaded when CAS update failed (T410652)
- 21:28 dani@deploy2002: Finished scap sync-world: Backport for Increase coverage of 2025 Global Readers Survey (non-enwiki) (T410918), OATHAuth: Expand 2FA to all users (T399664) (duration: 11m 18s)
- 21:24 dani@deploy2002: dani, mstyles: Continuing with sync
- 21:19 dani@deploy2002: dani, mstyles: Backport for Increase coverage of 2025 Global Readers Survey (non-enwiki) (T410918), OATHAuth: Expand 2FA to all users (T399664) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:17 dani@deploy2002: Started scap sync-world: Backport for Increase coverage of 2025 Global Readers Survey (non-enwiki) (T410918), OATHAuth: Expand 2FA to all users (T399664)
- 21:14 aude@deploy2002: Finished scap sync-world: Backport for [Legal Footer] Create config for adding legal footer (T410163) (duration: 08m 38s)
- 21:10 aude@deploy2002: aude, lmora: Continuing with sync
- 21:08 aude@deploy2002: aude, lmora: Backport for [Legal Footer] Create config for adding legal footer (T410163) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:05 aude@deploy2002: Started scap sync-world: Backport for [Legal Footer] Create config for adding legal footer (T410163)
- 20:53 aqu@deploy2002: Finished deploy [analytics/refinery@6dfb3b8] (thin): Deploy spur hqls THIN [analytics/refinery@6dfb3b8b] (duration: 01m 16s)
- 20:51 aqu@deploy2002: Started deploy [analytics/refinery@6dfb3b8] (thin): Deploy spur hqls THIN [analytics/refinery@6dfb3b8b]
- 20:51 aqu@deploy2002: Finished deploy [analytics/refinery@6dfb3b8]: Deploy spur hqls [analytics/refinery@6dfb3b8b] (duration: 02m 29s)
- 20:49 aqu@deploy2002: Started deploy [analytics/refinery@6dfb3b8]: Deploy spur hqls [analytics/refinery@6dfb3b8b]
- 20:48 aqu@deploy2002: Finished deploy [analytics/refinery@6dfb3b8] (hadoop-test): Deploy spur hqls TEST [analytics/refinery@6dfb3b8b] (duration: 01m 01s)
- 20:47 aqu@deploy2002: Started deploy [analytics/refinery@6dfb3b8] (hadoop-test): Deploy spur hqls TEST [analytics/refinery@6dfb3b8b]
- 20:44 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudservices1005.eqiad.wmnet with reason: host reimage
- 20:43 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1020.eqiad.wmnet
- 20:43 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs1020.eqiad.wmnet
- 20:40 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudservices1005.eqiad.wmnet with reason: host reimage
- 20:25 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudservices1005.eqiad.wmnet with OS trixie
- 20:22 eileen: civicrm upgraded from 45931830 to b1fc5afc
- 20:02 ejegg: payments-wiki upgraded from eeadc2d8 to 714ed4cf
- 20:00 eileen: civicrm upgraded from c6d1f24b to 45931830
- 19:58 sukhe@dns1004: END - running authdns-update
- 19:57 sukhe@dns1004: START - running authdns-update
- 19:52 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 (T410589)', diff saved to https://phabricator.wikimedia.org/P86392 and previous config saved to /var/cache/conftool/dbconfig/20251203-195207-ladsgroup.json
- 19:51 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs1020.eqiad.wmnet with OS bullseye
- 19:37 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P86390 and previous config saved to /var/cache/conftool/dbconfig/20251203-193659-ladsgroup.json
- 19:28 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1020.eqiad.wmnet with reason: host reimage
- 19:23 hashar@deploy2002: Finished deploy [gerrit/gerrit@93bde2a]: Ease configuration of the motd banner (duration: 00m 09s)
- 19:22 hashar@deploy2002: Started deploy [gerrit/gerrit@93bde2a]: Ease configuration of the motd banner
- 19:22 cmooney@cumin1003: END (PASS) - Cookbook sre.network.cf (exit_code=0)
- 19:22 cmooney@cumin1003: START - Cookbook sre.network.cf
- 19:22 topranks: disabling remote announcement of bgp prefixes
- 19:21 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P86388 and previous config saved to /var/cache/conftool/dbconfig/20251203-192152-ladsgroup.json
- 19:21 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs1020.eqiad.wmnet with reason: host reimage
- 19:14 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudservices1006.eqiad.wmnet with OS trixie
- 19:06 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 (T410589)', diff saved to https://phabricator.wikimedia.org/P86387 and previous config saved to /var/cache/conftool/dbconfig/20251203-190644-ladsgroup.json
- 19:06 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs1020.eqiad.wmnet with OS bullseye
- 18:37 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirtlocal1001.eqiad.wmnet with OS trixie
- 18:26 ladsgroup@deploy2002: Finished scap sync-world: Backport for findBadBlobs: Fix the --scan-to option (T351953), findBadBlobs: Fix the --scan-to option (T351953) (duration: 06m 48s)
- 18:25 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1020.eqiad.wmnet with reason: move primary uplink from asw2-d7-eqiad to lsw1-d7-eqiad and remove link to asw2-c2-eqiad
- 18:22 ladsgroup@deploy2002: ladsgroup: Continuing with sync
- 18:22 ladsgroup@deploy2002: ladsgroup: Backport for findBadBlobs: Fix the --scan-to option (T351953), findBadBlobs: Fix the --scan-to option (T351953) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 18:19 ladsgroup@deploy2002: Started scap sync-world: Backport for findBadBlobs: Fix the --scan-to option (T351953), findBadBlobs: Fix the --scan-to option (T351953)
- 18:12 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudservices1006.eqiad.wmnet with reason: host reimage
- 18:08 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudservices1006.eqiad.wmnet with reason: host reimage
- 18:05 jhancock@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 18:05 jhancock@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: updating for cloudceph to codfw - jhancock@cumin1003"
- 18:04 jhancock@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: updating for cloudceph to codfw - jhancock@cumin1003"
- 18:01 jhancock@cumin1003: START - Cookbook sre.dns.netbox
- 18:01 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirtlocal1001.eqiad.wmnet with reason: host reimage
- 17:57 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirtlocal1001.eqiad.wmnet with reason: host reimage
- 17:50 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudservices1006.eqiad.wmnet with OS trixie
- 17:46 sukhe@cumin1003: END (PASS) - Cookbook sre.network.cf (exit_code=0)
- 17:46 sukhe@cumin1003: START - Cookbook sre.network.cf
- 17:46 sukhe@cumin1003: END (FAIL) - Cookbook sre.network.cf (exit_code=1)
- 17:46 sukhe@cumin1003: START - Cookbook sre.network.cf
- 17:46 sukhe@cumin1003: END (PASS) - Cookbook sre.network.cf (exit_code=0)
- 17:46 sukhe@cumin1003: START - Cookbook sre.network.cf
- 17:46 sukhe@cumin1003: END (PASS) - Cookbook sre.network.cf (exit_code=0)
- 17:45 sukhe@cumin1003: START - Cookbook sre.network.cf
- 17:40 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirtlocal1001.eqiad.wmnet with OS trixie
- 17:40 sbisson@deploy2002: Finished scap sync-world: Backport for CX3 Build 1.0.0+20251126 (T384485) (duration: 09m 07s)
- 17:36 sbisson@deploy2002: sbisson: Continuing with sync
- 17:34 sbisson@deploy2002: sbisson: Backport for CX3 Build 1.0.0+20251126 (T384485) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 17:31 sbisson@deploy2002: Started scap sync-world: Backport for CX3 Build 1.0.0+20251126 (T384485)
- 17:11 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db1229.eqiad.wmnet with reason: crashed
- 17:07 jynus@cumin1003: dbctl commit (dc=all): 'Depool db1229', diff saved to https://phabricator.wikimedia.org/P86383 and previous config saved to /var/cache/conftool/dbconfig/20251203-170745-jynus.json
- 17:02 bd808@deploy2002: Finished scap sync-world: Backport for robots.php: Fix undefined index 'enabled' on Wikinews and closed wikis (T411632) (duration: 07m 40s)
- 16:58 bd808@deploy2002: bd808, krinkle: Continuing with sync
- 16:57 bd808@deploy2002: bd808, krinkle: Backport for robots.php: Fix undefined index 'enabled' on Wikinews and closed wikis (T411632) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 16:54 bd808@deploy2002: Started scap sync-world: Backport for robots.php: Fix undefined index 'enabled' on Wikinews and closed wikis (T411632)
- 16:49 bd808@deploy2002: Finished scap sync-world: Backport for officewiki: Put indicators in title with vector-2022, officewiki: Enable page protection indicators (duration: 07m 47s)
- 16:45 bd808@deploy2002: bd808: Continuing with sync
- 16:44 bd808@deploy2002: bd808: Backport for officewiki: Put indicators in title with vector-2022, officewiki: Enable page protection indicators synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 16:41 bd808@deploy2002: Started scap sync-world: Backport for officewiki: Put indicators in title with vector-2022, officewiki: Enable page protection indicators
- 16:15 topranks: disabling unused former cloudcephosd hosts on cloud switches T410989
- 16:13 dancy@deploy2002: Installation of scap version "4.229.0" completed for 164 hosts
- 16:09 dancy@deploy2002: Installing scap version "4.229.0" for 164 host(s)
- 15:32 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host conf2006.codfw.wmnet
- 15:28 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
- 15:27 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
- 15:27 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
- 15:27 ladsgroup@deploy2002: Finished scap sync-world: Backport for Clean up db groups config (T411088) (duration: 07m 48s)
- 15:27 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
- 15:26 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host conf2006.codfw.wmnet
- 15:26 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 15:26 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 15:23 ladsgroup@deploy2002: ladsgroup: Continuing with sync
- 15:23 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
- 15:22 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
- 15:21 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
- 15:21 ladsgroup@deploy2002: ladsgroup: Backport for Clean up db groups config (T411088) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 15:21 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
- 15:20 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 15:20 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 15:19 ladsgroup@deploy2002: Started scap sync-world: Backport for Clean up db groups config (T411088)
- 15:16 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: sync
- 15:16 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: sync
- 15:15 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
- 15:15 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
- 15:14 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
- 15:13 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
- 15:12 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 15:12 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 15:09 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
- 15:08 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
- 15:08 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
- 15:07 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
- 15:06 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 15:06 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2205.codfw.wmnet with reason: Maintenance
- 15:06 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 15:04 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: sync
- 15:03 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: sync
- 15:00 robh@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on alert1002.wikimedia.org with reason: C/D Migration
- 15:00 robh: alert1002 port migration now starting
- 14:54 Lucas_WMDE: UTC afternoon backport+config window done
- 14:49 esanders@deploy2002: Finished scap sync-world: Backport for DiscussionTools: cleanup unused config, Remove wgVisualEditorEditCheckSingleCheckMode (duration: 06m 44s)
- 14:45 esanders@deploy2002: esanders: Continuing with sync
- 14:44 esanders@deploy2002: esanders: Backport for DiscussionTools: cleanup unused config, Remove wgVisualEditorEditCheckSingleCheckMode synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:42 esanders@deploy2002: Started scap sync-world: Backport for DiscussionTools: cleanup unused config, Remove wgVisualEditorEditCheckSingleCheckMode
- 14:38 esanders@deploy2002: Finished scap sync-world: Backport for Set Flow to read-only everywhere (T402552) (duration: 09m 44s)
- 14:33 esanders@deploy2002: esanders: Continuing with sync
- 14:31 esanders@deploy2002: esanders: Backport for Set Flow to read-only everywhere (T402552) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:29 esanders@deploy2002: Started scap sync-world: Backport for Set Flow to read-only everywhere (T402552)
- 14:27 XioNoX: push pfw policies - T411566
- 14:27 sbisson@deploy2002: Finished scap sync-world: Backport for CX3 Build 1.0.0+20251201 (T408842 T408844) (duration: 12m 01s)
- 14:21 sbisson@deploy2002: sbisson: Continuing with sync
- 14:17 sbisson@deploy2002: sbisson: Backport for CX3 Build 1.0.0+20251201 (T408842 T408844) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:15 sbisson@deploy2002: Started scap sync-world: Backport for CX3 Build 1.0.0+20251201 (T408842 T408844)
- 13:50 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2239.codfw.wmnet with reason: Maintenance
- 13:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86380 and previous config saved to /var/cache/conftool/dbconfig/20251203-135000-marostegui.json
- 13:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P86379 and previous config saved to /var/cache/conftool/dbconfig/20251203-133452-marostegui.json
- 13:32 kart_: Updated Recommendation API to 2025-12-02-200719-production (T408845, T408844, T384485)
- 13:30 kartik@deploy2002: helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
- 13:25 kartik@deploy2002: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
- 13:22 kartik@deploy2002: helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
- 13:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P86378 and previous config saved to /var/cache/conftool/dbconfig/20251203-131945-marostegui.json
- 13:14 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2229 (T410589)', diff saved to https://phabricator.wikimedia.org/P86377 and previous config saved to /var/cache/conftool/dbconfig/20251203-131448-ladsgroup.json
- 13:14 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2229.codfw.wmnet with reason: Maintenance
- 13:14 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224 (T410589)', diff saved to https://phabricator.wikimedia.org/P86376 and previous config saved to /var/cache/conftool/dbconfig/20251203-131435-ladsgroup.json
- 13:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86375 and previous config saved to /var/cache/conftool/dbconfig/20251203-130437-marostegui.json
- 13:01 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 13:00 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 13:00 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2227 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86374 and previous config saved to /var/cache/conftool/dbconfig/20251203-130002-marostegui.json
- 12:59 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2227.codfw.wmnet with reason: Maintenance
- 12:59 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86373 and previous config saved to /var/cache/conftool/dbconfig/20251203-125938-marostegui.json
- 12:59 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224', diff saved to https://phabricator.wikimedia.org/P86372 and previous config saved to /var/cache/conftool/dbconfig/20251203-125927-ladsgroup.json
- 12:57 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 12:56 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 12:56 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
- 12:55 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/api-gateway: apply
- 12:54 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
- 12:53 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
- 12:52 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 12:52 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
- 12:51 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/api-gateway: apply
- 12:51 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 12:50 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 12:50 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 12:50 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 12:49 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 12:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P86371 and previous config saved to /var/cache/conftool/dbconfig/20251203-124430-marostegui.json
- 12:44 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224', diff saved to https://phabricator.wikimedia.org/P86370 and previous config saved to /var/cache/conftool/dbconfig/20251203-124419-ladsgroup.json
- 12:32 claime: Restarting failed timer dump_cloud_ip_ranges on puppetservers
- 12:30 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 12:30 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 12:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P86369 and previous config saved to /var/cache/conftool/dbconfig/20251203-122923-marostegui.json
- 12:29 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224 (T410589)', diff saved to https://phabricator.wikimedia.org/P86368 and previous config saved to /var/cache/conftool/dbconfig/20251203-122912-ladsgroup.json
- 12:26 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 12:26 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 12:20 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 12:19 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
- 12:19 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
- 12:19 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 12:18 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 12:17 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 12:14 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86367 and previous config saved to /var/cache/conftool/dbconfig/20251203-121409-marostegui.json
- 12:09 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2209 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86366 and previous config saved to /var/cache/conftool/dbconfig/20251203-120933-marostegui.json
- 12:09 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2209.codfw.wmnet with reason: Maintenance
- 12:09 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86365 and previous config saved to /var/cache/conftool/dbconfig/20251203-120909-marostegui.json
- 11:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P86364 and previous config saved to /var/cache/conftool/dbconfig/20251203-115401-marostegui.json
- 11:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P86363 and previous config saved to /var/cache/conftool/dbconfig/20251203-113853-marostegui.json
- 11:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86362 and previous config saved to /var/cache/conftool/dbconfig/20251203-112345-marostegui.json
- 11:19 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2194 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86361 and previous config saved to /var/cache/conftool/dbconfig/20251203-111910-marostegui.json
- 11:19 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2194.codfw.wmnet with reason: Maintenance
- 11:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86360 and previous config saved to /var/cache/conftool/dbconfig/20251203-111846-marostegui.json
- 11:15 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest2010.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 11:15 elukey@cumin1003: START - Cookbook sre.hosts.provision for host sretest2010.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 11:12 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.powercycle (exit_code=0) for host ml-serve1013
- 11:07 elukey@cumin1003: START - Cookbook sre.hosts.powercycle for host ml-serve1013
- 11:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P86359 and previous config saved to /var/cache/conftool/dbconfig/20251203-110338-marostegui.json
- 10:58 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.powercycle (exit_code=0) for host sretest2001
- 10:53 elukey@cumin1003: START - Cookbook sre.hosts.powercycle for host sretest2001
- 10:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P86358 and previous config saved to /var/cache/conftool/dbconfig/20251203-104830-marostegui.json
- 10:35 kharlan@deploy2002: Finished scap sync-world: Backport for hCaptchaEditAttempt logging: Normalize line endings (T411578), hCaptchaEditAttempt logging: Normalize line endings (T411578) (duration: 07m 56s)
- 10:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86357 and previous config saved to /var/cache/conftool/dbconfig/20251203-103323-marostegui.json
- 10:30 kharlan@deploy2002: kharlan: Continuing with sync
- 10:29 kharlan@deploy2002: kharlan: Backport for hCaptchaEditAttempt logging: Normalize line endings (T411578), hCaptchaEditAttempt logging: Normalize line endings (T411578) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 10:27 kharlan@deploy2002: Started scap sync-world: Backport for hCaptchaEditAttempt logging: Normalize line endings (T411578), hCaptchaEditAttempt logging: Normalize line endings (T411578)
- 09:19 hashar@deploy2002: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.5 refs T408275
- 09:14 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti-test2001.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
- 09:14 ayounsi@cumin1003: START - Cookbook sre.hosts.provision for host ganeti-test2001.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
- 09:00 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on ganeti-test2001.codfw.wmnet with reason: test CR1207804
- 08:37 moritzm: upgrade Envoy on schema* T405808
- 08:32 ryankemper@cumin2002: END (FAIL) - Cookbook sre.hadoop.reboot-workers (exit_code=99) for Hadoop analytics cluster
- 08:13 moritzm: installing python-zipp security updates
- 07:47 moritzm: installing libtpms security updates
- 07:26 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db1169 gradually with 4 steps - Repooling db1169
- 07:12 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
- 07:05 moritzm: installing mako security updates
- 07:01 Amir1: ladsgroup@deploy2002:~$ mwscript-k8s --dblist=all -- purgeUserOptions.php --login-age 11 rememberpassword (T406724)
- 06:56 Amir1: ladsgroup@deploy2002:~$ mwscript-k8s --dblist=all -- purgeUserOptions.php --login-age 11 popups (T406724)
- 06:40 marostegui@cumin1003: START - Cookbook sre.mysql.pool db1169 gradually with 4 steps - Repooling db1169
- 06:39 marostegui@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) db1169 gradually with 4 steps - Repooling db1169
- 06:38 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2224 (T410589)', diff saved to https://phabricator.wikimedia.org/P86350 and previous config saved to /var/cache/conftool/dbconfig/20251203-063812-ladsgroup.json
- 06:38 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2224.codfw.wmnet with reason: Maintenance
- 06:37 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217 (T410589)', diff saved to https://phabricator.wikimedia.org/P86349 and previous config saved to /var/cache/conftool/dbconfig/20251203-063749-ladsgroup.json
- 06:35 marostegui@cumin1003: START - Cookbook sre.mysql.pool db1169 gradually with 4 steps - Repooling db1169
- 06:29 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) db1169 - Depooling db1169
- 06:29 marostegui@cumin1003: START - Cookbook sre.mysql.depool db1169 - Depooling db1169
- 06:26 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1169.eqiad.wmnet with OS trixie
- 06:22 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217', diff saved to https://phabricator.wikimedia.org/P86348 and previous config saved to /var/cache/conftool/dbconfig/20251203-062241-ladsgroup.json
- 06:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 06:15 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
- 06:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 06:15 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
- 06:07 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217', diff saved to https://phabricator.wikimedia.org/P86345 and previous config saved to /var/cache/conftool/dbconfig/20251203-060734-ladsgroup.json
- 06:05 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1169.eqiad.wmnet with reason: host reimage
- 05:58 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1169.eqiad.wmnet with reason: host reimage
- 05:52 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217 (T410589)', diff saved to https://phabricator.wikimedia.org/P86344 and previous config saved to /var/cache/conftool/dbconfig/20251203-055226-ladsgroup.json
- 05:44 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2190 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86343 and previous config saved to /var/cache/conftool/dbconfig/20251203-054438-marostegui.json
- 05:44 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2190.codfw.wmnet with reason: Maintenance
- 05:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86342 and previous config saved to /var/cache/conftool/dbconfig/20251203-054414-marostegui.json
- 05:41 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1169.eqiad.wmnet with OS trixie
- 05:36 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcontrol1011.eqiad.wmnet with OS trixie
- 05:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P86341 and previous config saved to /var/cache/conftool/dbconfig/20251203-052906-marostegui.json
- 05:27 marostegui: Drop sockpuppet database T411527
- 05:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P86340 and previous config saved to /var/cache/conftool/dbconfig/20251203-051359-marostegui.json
- 04:59 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcontrol1011.eqiad.wmnet with reason: host reimage
- 04:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86339 and previous config saved to /var/cache/conftool/dbconfig/20251203-045851-marostegui.json
- 04:57 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1223.eqiad.wmnet with reason: Maintenance
- 04:55 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcontrol1011.eqiad.wmnet with reason: host reimage
- 04:34 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcontrol1011.eqiad.wmnet with OS trixie
- 04:26 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcontrol1007.eqiad.wmnet with OS trixie
- 03:50 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcontrol1007.eqiad.wmnet with reason: host reimage
- 03:46 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcontrol1007.eqiad.wmnet with reason: host reimage
- 03:30 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcontrol1007.eqiad.wmnet with OS trixie
- 03:26 krinkle@deploy2002: Finished scap sync-world: Backport for robots.php: Avoid "404 Not Found" for Sitemap rule (T400023) (duration: 11m 08s)
- 03:22 krinkle@deploy2002: krinkle: Continuing with sync
- 03:17 krinkle@deploy2002: krinkle: Backport for robots.php: Avoid "404 Not Found" for Sitemap rule (T400023) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 03:15 krinkle@deploy2002: Started scap sync-world: Backport for robots.php: Avoid "404 Not Found" for Sitemap rule (T400023)
- 03:08 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcontrol1006.eqiad.wmnet with OS trixie
- 03:08 krinkle@deploy2002: Finished scap sync-world: Backport for robots.php: Clean up unused site, lang, and x-subdomain (T407122), Submit Commons sitemap to Bing/DuckDuckGo and remaining wikis to Google (T400023), robots.txt: Clean up inline comments, robots.txt: Remove redundant "/wiki/Fundraising_2007/comments" disallow (duration: 08m 26s)
- 03:03 krinkle@deploy2002: krinkle: Continuing with sync
- 03:02 krinkle@deploy2002: krinkle: Backport for robots.php: Clean up unused site, lang, and x-subdomain (T407122), Submit Commons sitemap to Bing/DuckDuckGo and remaining wikis to Google (T400023), robots.txt: Clean up inline comments, robots.txt: Remove redundant "/wiki/Fundraising_2007/comments" disallow synced to the testservers (see https://wiki
- 02:59 krinkle@deploy2002: Started scap sync-world: Backport for robots.php: Clean up unused site, lang, and x-subdomain (T407122), Submit Commons sitemap to Bing/DuckDuckGo and remaining wikis to Google (T400023), robots.txt: Clean up inline comments, robots.txt: Remove redundant "/wiki/Fundraising_2007/comments" disallow
- 02:34 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcontrol1006.eqiad.wmnet with reason: host reimage
- 02:27 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcontrol1006.eqiad.wmnet with reason: host reimage
- 02:13 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcontrol1006.eqiad.wmnet with OS trixie
- 02:05 andrew@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcontrol1006.eqiad.wmnet with OS trixie
- 01:50 eileen: civicrm upgraded from ef0b2676 to c6d1f24b
- 01:23 ryankemper@cumin2002: START - Cookbook sre.hadoop.reboot-workers for Hadoop analytics cluster
- 01:21 ryankemper@cumin2002: END (FAIL) - Cookbook sre.hadoop.reboot-workers (exit_code=99) for Hadoop analytics cluster
- 01:18 ryankemper@cumin2002: START - Cookbook sre.hadoop.reboot-workers for Hadoop analytics cluster
- 01:14 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 13m 30s)
- 01:01 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
- 00:50 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcontrol1006.eqiad.wmnet with OS trixie
- 00:33 zabe@deploy2002: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.5 refs T408275
- 00:24 zabe@deploy2002: Finished scap sync-world: Backport for Close klwiki (T411501) (duration: 07m 29s)
- 00:20 zabe@deploy2002: zabe: Continuing with sync
- 00:19 zabe@deploy2002: zabe: Backport for Close klwiki (T411501) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 00:17 zabe@deploy2002: Started scap sync-world: Backport for Close klwiki (T411501)
- 00:09 zabe@deploy2002: Finished scap sync-world: Backport for Close crwiki (T411501) (duration: 07m 59s)
- 00:05 zabe@deploy2002: zabe: Continuing with sync
- 00:04 zabe@deploy2002: zabe: Backport for Close crwiki (T411501) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 00:01 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2217 (T410589)', diff saved to https://phabricator.wikimedia.org/P86338 and previous config saved to /var/cache/conftool/dbconfig/20251203-000140-ladsgroup.json
- 00:01 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2217.codfw.wmnet with reason: Maintenance
- 00:01 zabe@deploy2002: Started scap sync-world: Backport for Close crwiki (T411501)
2025-12-02
- 23:43 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2177 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86337 and previous config saved to /var/cache/conftool/dbconfig/20251202-234356-marostegui.json
- 23:43 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
- 23:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86336 and previous config saved to /var/cache/conftool/dbconfig/20251202-234332-marostegui.json
- 23:41 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudrabbit1002.eqiad.wmnet with OS trixie
- 23:28 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P86335 and previous config saved to /var/cache/conftool/dbconfig/20251202-232824-marostegui.json
- 23:23 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudrabbit1002.eqiad.wmnet with reason: host reimage
- 23:23 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 23:23 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: move IPv6 gerrit-lb to IPs ending in ::2 T365259 - dzahn@cumin2002"
- 23:22 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: move IPv6 gerrit-lb to IPs ending in ::2 T365259 - dzahn@cumin2002"
- 23:17 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudrabbit1002.eqiad.wmnet with reason: host reimage
- 23:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P86334 and previous config saved to /var/cache/conftool/dbconfig/20251202-231317-marostegui.json
- 23:09 eileen: civicrm upgraded from 8d8400e1 to ef0b2676
- 23:02 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudrabbit1001.eqiad.wmnet with OS trixie
- 23:01 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudrabbit1002.eqiad.wmnet with OS trixie
- 23:00 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudrabbit1003.eqiad.wmnet with OS trixie
- 22:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86333 and previous config saved to /var/cache/conftool/dbconfig/20251202-225809-marostegui.json
- 22:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86332 and previous config saved to /var/cache/conftool/dbconfig/20251202-225122-marostegui.json
- 22:45 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudrabbit1001.eqiad.wmnet with reason: host reimage
- 22:42 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudrabbit1003.eqiad.wmnet with reason: host reimage
- 22:41 dzahn@cumin2002: START - Cookbook sre.dns.netbox
- 22:39 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudrabbit1001.eqiad.wmnet with reason: host reimage
- 22:38 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudrabbit1003.eqiad.wmnet with reason: host reimage
- 22:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P86331 and previous config saved to /var/cache/conftool/dbconfig/20251202-223615-marostegui.json
- 22:33 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 22:32 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 22:25 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudrabbit1001.eqiad.wmnet with OS trixie
- 22:23 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudrabbit1003.eqiad.wmnet with OS trixie
- 22:21 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P86330 and previous config saved to /var/cache/conftool/dbconfig/20251202-222107-marostegui.json
- 22:20 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1061.eqiad.wmnet with OS trixie
- 22:09 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1054.eqiad.wmnet with OS trixie
- 22:09 catrope@deploy2002: Finished scap sync-world: Backport for CentralAuthUser: Add debugging information for T385310 (T385310), CentralAuthUser: Add debugging information for T385310 (T385310) (duration: 07m 29s)
- 22:06 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86329 and previous config saved to /var/cache/conftool/dbconfig/20251202-220600-marostegui.json
- 22:05 catrope@deploy2002: catrope, matmarex: Continuing with sync
- 22:04 catrope@deploy2002: catrope, matmarex: Backport for CentralAuthUser: Add debugging information for T385310 (T385310), CentralAuthUser: Add debugging information for T385310 (T385310) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 22:01 catrope@deploy2002: Started scap sync-world: Backport for CentralAuthUser: Add debugging information for T385310 (T385310), CentralAuthUser: Add debugging information for T385310 (T385310)
- 21:55 dani@deploy2002: Finished scap sync-world: Backport for [beta] Undeploy experiment for 2025 Global Readers Survey (T410696), Deploy 2025 Global Readers Survey (non-enwiki) (T410918) (duration: 10m 23s)
- 21:51 dani@deploy2002: dani: Continuing with sync
- 21:47 dani@deploy2002: dani: Backport for [beta] Undeploy experiment for 2025 Global Readers Survey (T410696), Deploy 2025 Global Readers Survey (non-enwiki) (T410918) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:44 dani@deploy2002: Started scap sync-world: Backport for [beta] Undeploy experiment for 2025 Global Readers Survey (T410696), Deploy 2025 Global Readers Survey (non-enwiki) (T410918)
- 21:43 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 21:43 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 21:42 kgraessle@deploy2002: Finished scap sync-world: Backport for Enable revertrisk filters in thwiki (T409438) (duration: 10m 34s)
- 21:38 jhancock@cumin1003: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['backup2013']
- 21:38 jhancock@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['backup2013']
- 21:38 kgraessle@deploy2002: kgraessle: Continuing with sync
- 21:36 kgraessle@deploy2002: kgraessle: Backport for Enable revertrisk filters in thwiki (T409438) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:34 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 21:34 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 21:33 bking@dns1004: END - running authdns-update
- 21:32 bking@dns1004: START - running authdns-update
- 21:31 kgraessle@deploy2002: Started scap sync-world: Backport for Enable revertrisk filters in thwiki (T409438)
- 21:31 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 21:29 kharlan@deploy2002: Finished scap sync-world: Backport for Refactor: Move editing session ID logic into service (T406865), hCaptcha: Log diff when challenge is presented (T406865) (duration: 59m 06s)
- 21:26 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1050.eqiad.wmnet with OS trixie
- 21:20 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1061.eqiad.wmnet with reason: host reimage
- 21:17 kharlan@deploy2002: kharlan: Continuing with sync
- 21:16 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1061.eqiad.wmnet with reason: host reimage
- 21:15 kharlan@deploy2002: kharlan: Backport for Refactor: Move editing session ID logic into service (T406865), hCaptcha: Log diff when challenge is presented (T406865) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:14 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 21:14 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added gerrit-lb VIP - IPv6 - for ulsfo and magru T365259 - dzahn@cumin2002"
- 21:14 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added gerrit-lb VIP - IPv6 - for ulsfo and magru T365259 - dzahn@cumin2002"
- 21:10 dzahn@cumin2002: START - Cookbook sre.dns.netbox
- 21:04 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 21:03 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added gerrit-lb VIP - IPv6 - for drmrs, eqsin and esams T365259 - dzahn@cumin2002"
- 21:03 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added gerrit-lb VIP - IPv6 - for drmrs, eqsin and esams T365259 - dzahn@cumin2002"
- 21:00 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt1061.eqiad.wmnet with OS trixie
- 21:00 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1046.eqiad.wmnet with OS trixie
- 20:58 dzahn@cumin2002: START - Cookbook sre.dns.netbox
- 20:52 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 20:52 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added gerrit-lb VIP - IPv6 - for codfw and eqiad T365259 - dzahn@cumin2002"
- 20:52 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added gerrit-lb VIP - IPv6 - for codfw and eqiad T365259 - dzahn@cumin2002"
- 20:48 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1054.eqiad.wmnet with reason: host reimage
- 20:48 dzahn@cumin2002: START - Cookbook sre.dns.netbox
- 20:44 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1054.eqiad.wmnet with reason: host reimage
- 20:43 eileen: civicrm upgraded from c90bd037 to 8d8400e1
- 20:38 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1050.eqiad.wmnet with reason: host reimage
- 20:37 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 20:37 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added gerrit-lb VIP for magru and eqiad T365259 - dzahn@cumin2002"
- 20:37 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added gerrit-lb VIP for magru and eqiad T365259 - dzahn@cumin2002"
- 20:34 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1046.eqiad.wmnet with reason: host reimage
- 20:33 dzahn@cumin2002: START - Cookbook sre.dns.netbox
- 20:31 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 20:31 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added gerrit-lb VIP for drmrs and eqsin T365259 - dzahn@cumin2002"
- 20:31 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added gerrit-lb VIP for drmrs and eqsin T365259 - dzahn@cumin2002"
- 20:30 kharlan@deploy2002: Started scap sync-world: Backport for Refactor: Move editing session ID logic into service (T406865), hCaptcha: Log diff when challenge is presented (T406865)
- 20:29 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1050.eqiad.wmnet with reason: host reimage
- 20:28 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt1054.eqiad.wmnet with OS trixie
- 20:26 dzahn@cumin2002: START - Cookbook sre.dns.netbox
- 20:26 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1046.eqiad.wmnet with reason: host reimage
- 20:18 jhathaway@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest1005.eqiad.wmnet with OS bookworm
- 20:18 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 20:18 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added gerrit-lb VIP for esams and ulsfo T365259 - dzahn@cumin2002"
- 20:18 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added gerrit-lb VIP for esams and ulsfo T365259 - dzahn@cumin2002"
- 20:13 dzahn@cumin2002: START - Cookbook sre.dns.netbox
- 20:12 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt1050.eqiad.wmnet with OS trixie
- 20:09 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirt1046.eqiad.wmnet with OS trixie
- 19:58 jhathaway@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest1005.eqiad.wmnet with reason: host reimage
- 19:53 jhathaway@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest1005.eqiad.wmnet with reason: host reimage
- 19:52 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 19:49 dzahn@cumin2002: START - Cookbook sre.dns.netbox
- 19:48 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 19:48 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added gerrit-lb.codfw.wikimedia.org T365259 - dzahn@cumin2002"
- 19:46 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added gerrit-lb.codfw.wikimedia.org T365259 - dzahn@cumin2002"
- 19:43 cstone: payments-wiki upgraded from 6d39e545 to eeadc2d8
- 19:42 dzahn@cumin2002: START - Cookbook sre.dns.netbox
- 19:34 jhathaway@cumin1003: START - Cookbook sre.hosts.reimage for host sretest1005.eqiad.wmnet with OS bookworm
- 19:15 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
- 19:15 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
- 19:07 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P{lvs3010*} and A:liberica
- 19:03 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P{lvs3010*} and A:liberica
- 18:53 swfrench@cumin2002: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs-high-traffic1-eqiad (T352245)
- 18:53 swfrench@cumin2002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-high-traffic1-eqiad (T352245)
- 18:52 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudbackup1001-dev.eqiad.wmnet with OS trixie
- 18:47 swfrench@cumin2002: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs-high-traffic2-eqiad (T352245)
- 18:47 swfrench@cumin2002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-high-traffic2-eqiad (T352245)
- 18:41 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
- 18:41 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
- 18:41 swfrench@cumin2002: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs-low-traffic-eqiad (T352245)
- 18:40 swfrench@cumin2002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-low-traffic-eqiad (T352245)
- 18:36 swfrench@cumin2002: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs-secondary-eqiad (T352245)
- 18:36 swfrench@cumin2002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-secondary-eqiad (T352245)
- 18:32 Emperor: repool ms-fe2014 T410959
- 18:27 swfrench@deploy2002: Unlocked for deployment [MediaWiki]: Hold deployments during etcd certificate change - T352245 (duration: 17m 35s)
- 18:26 swfrench-wmf: restarted navtiming on webperf1003 - T352245
- 18:23 swfrench-wmf: begin rolling restarts of eqiad-associated confds - T352245
- 18:22 swfrench-wmf: migrating etcd to PKI certs on conf1007 - T352245
- 18:19 swfrench-wmf: deleted EtcdReplicationDown silence (42a82757-2075-44fd-b057-ec9ed2afeb90) - T352245
- 18:16 swfrench-wmf: manually transferred etcd replication source back to conf1009 - T352245
- 18:15 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
- 18:15 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
- 18:12 swfrench-wmf: migrating etcd to PKI certs on conf1009 - T352245
- 18:10 swfrench@deploy2002: Locking from deployment [MediaWiki]: Hold deployments during etcd certificate change - T352245
- 18:08 jhathaway@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest1005.eqiad.wmnet with OS bookworm
- 18:06 rzl@deploy2002: Finished scap sync-world: https://gerrit.wikimedia.org/r/1208442 T407553 (duration: 06m 36s)
- 18:04 swfrench-wmf: manually transferred codfw etcd replication source to conf1008 - T352245
- 18:02 rzl@deploy2002: rzl: Continuing with sync
- 18:01 rzl@deploy2002: rzl: https://gerrit.wikimedia.org/r/1208442 T407553 synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 18:01 swfrench-wmf: silenced EtcdReplicationDown (42a82757-2075-44fd-b057-ec9ed2afeb90) - T352245
- 18:00 rzl@deploy2002: Started scap sync-world: https://gerrit.wikimedia.org/r/1208442 T407553
- 17:48 jhathaway@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest1005.eqiad.wmnet with reason: host reimage
- 17:47 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1212 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86328 and previous config saved to /var/cache/conftool/dbconfig/20251202-174732-marostegui.json
- 17:47 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 6 hosts with reason: Maintenance
- 17:47 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1212.eqiad.wmnet with reason: Maintenance
- 17:44 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db1251.eqiad.wmnet onto db1169.eqiad.wmnet
- 17:43 jhathaway@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest1005.eqiad.wmnet with reason: host reimage
- 17:42 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2156 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86327 and previous config saved to /var/cache/conftool/dbconfig/20251202-174249-marostegui.json
- 17:42 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
- 17:42 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86326 and previous config saved to /var/cache/conftool/dbconfig/20251202-174225-marostegui.json
- 17:29 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudbackup1001-dev.eqiad.wmnet with reason: host reimage
- 17:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P86325 and previous config saved to /var/cache/conftool/dbconfig/20251202-172717-marostegui.json
- 17:24 jhathaway@cumin1003: START - Cookbook sre.hosts.reimage for host sretest1005.eqiad.wmnet with OS bookworm
- 17:22 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudbackup1001-dev.eqiad.wmnet with reason: host reimage
- 17:21 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2197.codfw.wmnet with reason: Maintenance
- 17:21 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193 (T410589)', diff saved to https://phabricator.wikimedia.org/P86324 and previous config saved to /var/cache/conftool/dbconfig/20251202-172134-ladsgroup.json
- 17:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P86323 and previous config saved to /var/cache/conftool/dbconfig/20251202-171210-marostegui.json
- 17:10 jhathaway@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest1005.eqiad.wmnet with OS bookworm
- 17:09 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudbackup1001-dev.eqiad.wmnet with OS trixie
- 17:06 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P86322 and previous config saved to /var/cache/conftool/dbconfig/20251202-170627-ladsgroup.json
- 17:06 swfrench@cumin2002: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs-high-traffic1-eqiad (T352245)
- 17:05 swfrench@cumin2002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-high-traffic1-eqiad (T352245)
- 17:03 brett: import varnish-modules 0.20.0-2~deb13+wmf1 into trixie-wikimedia - T401832
- 17:02 swfrench@cumin2002: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs-high-traffic2-eqiad (T352245)
- 17:01 swfrench@cumin2002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-high-traffic2-eqiad (T352245)
- 16:59 swfrench@cumin2002: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs-low-traffic-eqiad (T352245)
- 16:58 swfrench@cumin2002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-low-traffic-eqiad (T352245)
- 16:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86321 and previous config saved to /var/cache/conftool/dbconfig/20251202-165702-marostegui.json
- 16:54 jhathaway@cumin1003: START - Cookbook sre.hosts.reimage for host sretest1005.eqiad.wmnet with OS bookworm
- 16:53 swfrench@cumin2002: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs-secondary-eqiad (T352245)
- 16:53 swfrench@cumin2002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-secondary-eqiad (T352245)
- 16:51 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P86320 and previous config saved to /var/cache/conftool/dbconfig/20251202-165119-ladsgroup.json
- 16:51 jhathaway@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1005.eqiad.wmnet with OS bookworm
- 16:44 ihurbain@deploy2002: Finished scap sync-world: Backport for Bump parsoid to v0.23.0-a7.1 on wmf.4 (T411238 T410960), Bump parsoid to v0.23.0-a7.1 on wmf.4 (T411238 T410960) (duration: 09m 21s)
- 16:43 jhathaway@cumin1003: START - Cookbook sre.hosts.reimage for host sretest1005.eqiad.wmnet with OS bookworm
- 16:43 inflatador: bking@wmf3062 restart WDQS codfw to resolve lag/possible deadlocks
- 16:39 ihurbain@deploy2002: ihurbain: Continuing with sync
- 16:39 jhathaway@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest1005.eqiad.wmnet with OS bookworm
- 16:38 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
- 16:38 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
- 16:37 ihurbain@deploy2002: ihurbain: Backport for Bump parsoid to v0.23.0-a7.1 on wmf.4 (T411238 T410960), Bump parsoid to v0.23.0-a7.1 on wmf.4 (T411238 T410960) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 16:36 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193 (T410589)', diff saved to https://phabricator.wikimedia.org/P86319 and previous config saved to /var/cache/conftool/dbconfig/20251202-163612-ladsgroup.json
- 16:36 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db1251 gradually with 4 steps - Pool db1251.eqiad.wmnet in after cloning
- 16:35 ihurbain@deploy2002: Started scap sync-world: Backport for Bump parsoid to v0.23.0-a7.1 on wmf.4 (T411238 T410960), Bump parsoid to v0.23.0-a7.1 on wmf.4 (T411238 T410960)
- 16:30 jhathaway@cumin1003: START - Cookbook sre.hosts.reimage for host sretest1005.eqiad.wmnet with OS bookworm
- 16:27 brett: import varnish 7.1.1-2~bpo13+wmf2 into trixie-wikimedia - T401832
- 16:24 jhathaway@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1005.eqiad.wmnet with OS bookworm
- 16:23 jhathaway@cumin1003: START - Cookbook sre.hosts.reimage for host sretest1005.eqiad.wmnet with OS bookworm
- 16:20 jhathaway@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1005.eqiad.wmnet with OS bookworm
- 16:19 jhathaway@cumin1003: START - Cookbook sre.hosts.reimage for host sretest1005.eqiad.wmnet with OS bookworm
- 16:18 swfrench-wmf: restarted navtiming on webperf1003 - T352245
- 16:14 swfrench-wmf: begin rolling restarts of eqiad-associated confds - T352245
- 16:12 moritzm: installing nodejs security updates
- 16:12 swfrench@deploy2002: Unlocked for deployment [MediaWiki]: Hold deployments during etcd certificate change - T352245 (duration: 03m 45s)
- 16:12 jhathaway@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1005.eqiad.wmnet with OS bookworm
- 16:10 jhathaway@cumin1003: START - Cookbook sre.hosts.reimage for host sretest1005.eqiad.wmnet with OS bookworm
- 16:08 swfrench@deploy2002: Locking from deployment [MediaWiki]: Hold deployments during etcd certificate change - T352245
- 16:08 swfrench-wmf: migrating etcd to PKI certs on conf1008 - T352245
- 16:08 jhathaway@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest1005.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
- 16:02 moritzm: installing libsndfile security updates
- 16:01 jhathaway@cumin1003: START - Cookbook sre.hosts.provision for host sretest1005.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
- 16:00 gehel: restarting wdqs@codfw - system overloaded
- 15:58 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on sretest1005.eqiad.wmnet with reason: ipxe
- 15:50 marostegui@cumin1003: START - Cookbook sre.mysql.pool db1251 gradually with 4 steps - Pool db1251.eqiad.wmnet in after cloning
- 15:48 moritzm: upgrade Envoy on Yarn T405808
- 15:45 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1088.eqiad.wmnet with OS bullseye
- 15:29 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1088.eqiad.wmnet with reason: host reimage
- 15:26 mvernon@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1088.eqiad.wmnet with reason: host reimage
- 15:13 moritzm: upgrade Envoy on Turnilo T405808
- 15:12 mvernon@cumin1003: START - Cookbook sre.hosts.reimage for host ms-be1088.eqiad.wmnet with OS bullseye
- 14:51 Lucas_WMDE: UTC afternoon backport+config window done
- 14:47 urbanecm@deploy2002: Finished scap sync-world: Backport for [Growth] Enable Add Link for 3 wikis (T407818) (duration: 07m 46s)
- 14:43 urbanecm@deploy2002: urbanecm: Continuing with sync
- 14:41 urbanecm@deploy2002: urbanecm: Backport for [Growth] Enable Add Link for 3 wikis (T407818) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:41 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1198 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86314 and previous config saved to /var/cache/conftool/dbconfig/20251202-144148-marostegui.json
- 14:41 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1198.eqiad.wmnet with reason: Maintenance
- 14:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86313 and previous config saved to /var/cache/conftool/dbconfig/20251202-144123-marostegui.json
- 14:39 urbanecm@deploy2002: Started scap sync-world: Backport for [Growth] Enable Add Link for 3 wikis (T407818)
- 14:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti-test2001.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
- 14:30 derick@deploy2002: Finished scap sync-world: Backport for user: Mark users created with User::addToDatabase() as primary (T410652) (duration: 08m 34s)
- 14:28 ayounsi@cumin1003: START - Cookbook sre.hosts.provision for host ganeti-test2001.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
- 14:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P86312 and previous config saved to /var/cache/conftool/dbconfig/20251202-142616-marostegui.json
- 14:26 derick@deploy2002: d3r1ck01, derick: Continuing with sync
- 14:25 derick@deploy2002: d3r1ck01, derick: Backport for user: Mark users created with User::addToDatabase() as primary (T410652) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:21 derick@deploy2002: Started scap sync-world: Backport for user: Mark users created with User::addToDatabase() as primary (T410652)
- 14:21 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti-test2001.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
- 14:18 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Growth: Enable Revise Tone feature on pilot wikis (T409606) (duration: 13m 03s)
- 14:14 ayounsi@cumin1003: START - Cookbook sre.hosts.provision for host ganeti-test2001.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
- 14:13 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, migr: Continuing with sync
- 14:12 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti-test2001.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
- 14:11 ayounsi@cumin1003: START - Cookbook sre.hosts.provision for host ganeti-test2001.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
- 14:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P86311 and previous config saved to /var/cache/conftool/dbconfig/20251202-141108-marostegui.json
- 14:11 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on ganeti-test2001.codfw.wmnet with reason: test CR1207804
- 14:10 jgleeson: payments-wiki upgraded from b405d6db to 6d39e545
- 14:07 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, migr: Backport for Growth: Enable Revise Tone feature on pilot wikis (T409606) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:05 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Growth: Enable Revise Tone feature on pilot wikis (T409606)
- 13:58 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) db1251 - Depool db1251.eqiad.wmnet to then clone it to db1169.eqiad.wmnet - marostegui@cumin1003
- 13:58 marostegui@cumin1003: START - Cookbook sre.mysql.depool db1251 - Depool db1251.eqiad.wmnet to then clone it to db1169.eqiad.wmnet - marostegui@cumin1003
- 13:58 marostegui@cumin1003: START - Cookbook sre.mysql.clone of db1251.eqiad.wmnet onto db1169.eqiad.wmnet
- 13:57 dbrant@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifeeds: apply
- 13:56 dbrant@deploy2002: helmfile [codfw] START helmfile.d/services/wikifeeds: apply
- 13:56 dbrant@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifeeds: apply
- 13:56 dbrant@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifeeds: apply
- 13:56 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86309 and previous config saved to /var/cache/conftool/dbconfig/20251202-135600-marostegui.json
- 13:55 dbrant@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
- 13:54 dbrant@deploy2002: helmfile [staging] START helmfile.d/services/wikifeeds: apply
- 13:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1169.eqiad.wmnet with OS bookworm
- 13:07 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply
- 13:07 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply
- 13:04 brouberol: running rebalancing of kafka-main-codfw with throttle of 30MB/s - T407185
- 13:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1169.eqiad.wmnet with reason: host reimage
- 12:59 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1169.eqiad.wmnet with reason: host reimage
- 12:46 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2193 (T410589)', diff saved to https://phabricator.wikimedia.org/P86308 and previous config saved to /var/cache/conftool/dbconfig/20251202-124632-ladsgroup.json
- 12:46 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2193.codfw.wmnet with reason: Maintenance
- 12:46 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T410589)', diff saved to https://phabricator.wikimedia.org/P86307 and previous config saved to /var/cache/conftool/dbconfig/20251202-124609-ladsgroup.json
- 12:43 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host db1169.eqiad.wmnet with OS bookworm
- 12:41 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host db1169.eqiad.wmnet with OS bookworm
- 12:40 kharlan@deploy2002: Finished scap sync-world: Backport for SI: Skip successfuledit event for null edits (T410280), SI: Skip successfuledit event for null edits (T410280) (duration: 06m 39s)
- 12:36 kharlan@deploy2002: kharlan: Continuing with sync
- 12:35 kharlan@deploy2002: kharlan: Backport for SI: Skip successfuledit event for null edits (T410280), SI: Skip successfuledit event for null edits (T410280) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 12:33 kharlan@deploy2002: Started scap sync-world: Backport for SI: Skip successfuledit event for null edits (T410280), SI: Skip successfuledit event for null edits (T410280)
- 12:31 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P86305 and previous config saved to /var/cache/conftool/dbconfig/20251202-123102-ladsgroup.json
- 12:17 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host db1169.eqiad.wmnet with OS bookworm
- 12:15 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P86304 and previous config saved to /var/cache/conftool/dbconfig/20251202-121554-ladsgroup.json
- 12:04 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 12:04 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 12:00 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T410589)', diff saved to https://phabricator.wikimedia.org/P86303 and previous config saved to /var/cache/conftool/dbconfig/20251202-120046-ladsgroup.json
- 11:57 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 11:56 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 11:44 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 11:44 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 11:41 kharlan@deploy2002: Finished scap sync-world: Backport for wgAutoConfirmCount: Raise value to 10 for frwiki, idwiki, trwiki (T411263) (duration: 08m 28s)
- 11:37 Emperor: rebuild RAID on ms-fe2014 T410959
- 11:36 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2149 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86302 and previous config saved to /var/cache/conftool/dbconfig/20251202-113625-marostegui.json
- 11:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
- 11:35 kharlan@deploy2002: kharlan: Continuing with sync
- 11:34 kharlan@deploy2002: kharlan: Backport for wgAutoConfirmCount: Raise value to 10 for frwiki, idwiki, trwiki (T411263) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 11:32 kharlan@deploy2002: Started scap sync-world: Backport for wgAutoConfirmCount: Raise value to 10 for frwiki, idwiki, trwiki (T411263)
- 11:16 kharlan@deploy2002: Finished scap sync-world: Backport for hCaptcha: Switch frwiki to 99.9% passive mode (T405586), hCaptcha: Enable hCaptcha editing in 100% passive mode on enwiki (T405586) (duration: 08m 55s)
- 11:12 kharlan@deploy2002: kharlan: Continuing with sync
- 11:10 kharlan@deploy2002: kharlan: Backport for hCaptcha: Switch frwiki to 99.9% passive mode (T405586), hCaptcha: Enable hCaptcha editing in 100% passive mode on enwiki (T405586) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 11:07 kharlan@deploy2002: Started scap sync-world: Backport for hCaptcha: Switch frwiki to 99.9% passive mode (T405586), hCaptcha: Enable hCaptcha editing in 100% passive mode on enwiki (T405586)
- 10:51 moritzm: rebuild software raid following disk swap on bast2003 T410195
- 10:41 bwojtowicz@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
- 10:38 elukey: upgrade spicerack to 12.1.0 on all cumin hosts
- 10:36 cmooney@cumin1003: END (FAIL) - Cookbook sre.hosts.dhcp (exit_code=99) for host sretest1005.eqiad.wmnet
- 10:36 kharlan@deploy2002: Finished scap sync-world: Backport for UserInfoCard: Hide activity graph when it's likely to be inaccurate (T400409) (duration: 10m 26s)
- 10:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
- 10:33 ayounsi@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
- 10:33 ayounsi@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
- 10:32 ayounsi@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
- 10:32 kharlan@deploy2002: kharlan: Continuing with sync
- 10:31 bwojtowicz@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
- 10:29 bwojtowicz@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
- 10:27 kharlan@deploy2002: kharlan: Backport for UserInfoCard: Hide activity graph when it's likely to be inaccurate (T400409) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 10:25 kharlan@deploy2002: Started scap sync-world: Backport for UserInfoCard: Hide activity graph when it's likely to be inaccurate (T400409)
- 10:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db2223 gradually with 4 steps - After switchover
- 10:21 kharlan@deploy2002: Finished scap sync-world: Backport for Allow similar signals to be merged into an existing case (T410303) (duration: 07m 52s)
- 10:17 kharlan@deploy2002: kharlan: Continuing with sync
- 10:15 kharlan@deploy2002: kharlan: Backport for Allow similar signals to be merged into an existing case (T410303) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 10:13 kharlan@deploy2002: Started scap sync-world: Backport for Allow similar signals to be merged into an existing case (T410303)
- 10:05 ayounsi@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
- 10:04 ayounsi@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
- 09:55 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2192.codfw.wmnet with reason: Maintenance
- 09:53 marostegui@cumin1003: START - Cookbook sre.mysql.pool db2223 gradually with 4 steps - After switchover
- 09:53 marostegui@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) db2223 gradually with 4 steps - After switchover
- 09:52 marostegui@cumin1003: START - Cookbook sre.mysql.pool db2223 gradually with 4 steps - After switchover
- 09:50 ayounsi@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
- 09:50 ayounsi@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
- 09:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86298 and previous config saved to /var/cache/conftool/dbconfig/20251202-094931-marostegui.json
- 09:46 elukey: uploaded spicerack_12.1.0 to apt.wikimedia.org bullseye-wikimedia,bookworm-wikimedia
- 09:43 klausman@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
- 09:43 klausman@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
- 09:43 klausman@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
- 09:42 klausman@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
- 09:41 klausman@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
- 09:41 klausman@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
- 09:38 moritzm: upgrade Envoy on parsoidtest/testreduce T405808
- 09:09 hashar@deploy2002: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.5 refs T408275
- 09:09 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1189 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86297 and previous config saved to /var/cache/conftool/dbconfig/20251202-090932-marostegui.json
- 09:09 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1189.eqiad.wmnet with reason: Maintenance
- 09:09 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86296 and previous config saved to /var/cache/conftool/dbconfig/20251202-090908-marostegui.json
- 09:03 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2223 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86295 and previous config saved to /var/cache/conftool/dbconfig/20251202-090334-marostegui.json
- 09:03 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2223.codfw.wmnet with reason: Maintenance
- 09:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86294 and previous config saved to /var/cache/conftool/dbconfig/20251202-090321-marostegui.json
- 08:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P86293 and previous config saved to /var/cache/conftool/dbconfig/20251202-085401-marostegui.json
- 08:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213', diff saved to https://phabricator.wikimedia.org/P86292 and previous config saved to /var/cache/conftool/dbconfig/20251202-084813-marostegui.json
- 08:40 gehel: restarting wdqs@codfw - system overloaded
- 08:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P86291 and previous config saved to /var/cache/conftool/dbconfig/20251202-083853-marostegui.json
- 08:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213', diff saved to https://phabricator.wikimedia.org/P86290 and previous config saved to /var/cache/conftool/dbconfig/20251202-083306-marostegui.json
- 08:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86289 and previous config saved to /var/cache/conftool/dbconfig/20251202-082345-marostegui.json
- 08:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86288 and previous config saved to /var/cache/conftool/dbconfig/20251202-081758-marostegui.json
- 08:17 dcausse: closing the utc morning backport window
- 08:14 dcausse@deploy2002: Finished scap sync-world: Backport for cirrus: enable georgian transliteration second try profile (T408737) (duration: 10m 00s)
- 08:09 dcausse@deploy2002: dcausse: Continuing with sync
- 08:06 dcausse@deploy2002: dcausse: Backport for cirrus: enable georgian transliteration second try profile (T408737) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 08:04 dcausse@deploy2002: Started scap sync-world: Backport for cirrus: enable georgian transliteration second try profile (T408737)
- 07:53 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2223.codfw.wmnet with reason: Schema change
- 07:35 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2180 (T410589)', diff saved to https://phabricator.wikimedia.org/P86287 and previous config saved to /var/cache/conftool/dbconfig/20251202-073553-ladsgroup.json
- 07:35 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2180.codfw.wmnet with reason: Maintenance
- 07:35 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169 (T410589)', diff saved to https://phabricator.wikimedia.org/P86286 and previous config saved to /var/cache/conftool/dbconfig/20251202-073530-ladsgroup.json
- 07:20 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P86285 and previous config saved to /var/cache/conftool/dbconfig/20251202-072022-ladsgroup.json
- 07:05 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P86284 and previous config saved to /var/cache/conftool/dbconfig/20251202-070514-ladsgroup.json
- 06:50 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169 (T410589)', diff saved to https://phabricator.wikimedia.org/P86283 and previous config saved to /var/cache/conftool/dbconfig/20251202-065007-ladsgroup.json
- 06:32 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2228.codfw.wmnet with reason: Schema change
- 05:59 kart_: Updated cxserver to 2025-12-02-041957-production + Yandex key removal from production config
- 05:59 kartik@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
- 05:57 kartik@deploy2002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
- 05:52 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
- 05:52 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/cxserver: apply
- 05:50 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
- 05:49 kartik@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
- 05:20 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2213 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86282 and previous config saved to /var/cache/conftool/dbconfig/20251202-052010-marostegui.json
- 05:20 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2213.codfw.wmnet with reason: Maintenance
- 05:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86281 and previous config saved to /var/cache/conftool/dbconfig/20251202-051947-marostegui.json
- 05:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P86280 and previous config saved to /var/cache/conftool/dbconfig/20251202-050439-marostegui.json
- 05:02 mwpresync@deploy2002: Pruned MediaWiki: 1.46.0-wmf.2 (duration: 02m 56s)
- 04:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P86279 and previous config saved to /var/cache/conftool/dbconfig/20251202-044931-marostegui.json
- 04:48 mwpresync@deploy2002: Finished scap sync-world: testwikis to 1.46.0-wmf.5 refs T408275 (duration: 44m 45s)
- 04:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86278 and previous config saved to /var/cache/conftool/dbconfig/20251202-043424-marostegui.json
- 04:03 mwpresync@deploy2002: Started scap sync-world: testwikis to 1.46.0-wmf.5 refs T408275
- 03:52 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1175 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86277 and previous config saved to /var/cache/conftool/dbconfig/20251202-035202-marostegui.json
- 03:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1175.eqiad.wmnet with reason: Maintenance
- 03:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86276 and previous config saved to /var/cache/conftool/dbconfig/20251202-035138-marostegui.json
- 03:43 cstone: payments-wiki upgraded from c1b83aa2 to b405d6db
- 03:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P86275 and previous config saved to /var/cache/conftool/dbconfig/20251202-033630-marostegui.json
- 03:21 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P86274 and previous config saved to /var/cache/conftool/dbconfig/20251202-032122-marostegui.json
- 03:15 mutante: vrts1003 - compressed /opt/znuny-6.5.16 and .17 to .tar.gz files - then deleted uncompressed versions - freeing about 700k inodes (T411452)
- 03:14 mutante: vrts1003 - sudo -u otrs ./bin/otrs.Console.pl Maint::Cache::Delete (T411452)
- 03:06 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86273 and previous config saved to /var/cache/conftool/dbconfig/20251202-030615-marostegui.json
- 01:36 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2211 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86272 and previous config saved to /var/cache/conftool/dbconfig/20251202-013635-marostegui.json
- 01:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2211.codfw.wmnet with reason: Maintenance
- 00:05 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2169 (T410589)', diff saved to https://phabricator.wikimedia.org/P86271 and previous config saved to /var/cache/conftool/dbconfig/20251202-000540-ladsgroup.json
- 00:05 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
- 00:05 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T410589)', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20251202-000512-ladsgroup.json
2025-12-01
- 23:50 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P86269 and previous config saved to /var/cache/conftool/dbconfig/20251201-235004-ladsgroup.json
- 23:45 catrope@deploy2002: Finished scap sync-world: Backport for Make sure WebAuthnKey::$supportsPasswordless is always initialized (T411368) (duration: 07m 36s)
- 23:41 catrope@deploy2002: catrope: Continuing with sync
- 23:39 catrope@deploy2002: catrope: Backport for Make sure WebAuthnKey::$supportsPasswordless is always initialized (T411368) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 23:38 catrope@deploy2002: Started scap sync-world: Backport for Make sure WebAuthnKey::$supportsPasswordless is always initialized (T411368)
- 23:34 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P86268 and previous config saved to /var/cache/conftool/dbconfig/20251201-233456-ladsgroup.json
- 23:19 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T410589)', diff saved to https://phabricator.wikimedia.org/P86267 and previous config saved to /var/cache/conftool/dbconfig/20251201-231949-ladsgroup.json
- 22:50 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
- 22:40 logmsgbot: mstyles Deployed security patch for T411144
- 22:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2201.codfw.wmnet with reason: Maintenance
- 22:28 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86266 and previous config saved to /var/cache/conftool/dbconfig/20251201-222810-marostegui.json
- 22:26 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1166 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86265 and previous config saved to /var/cache/conftool/dbconfig/20251201-222607-marostegui.json
- 22:26 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1166.eqiad.wmnet with reason: Maintenance
- 22:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86264 and previous config saved to /var/cache/conftool/dbconfig/20251201-222544-marostegui.json
- 22:20 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on zuul2002.codfw.wmnet with reason: reboot
- 22:13 larssandergreen: civicrm upgraded from ee12d616 to c90bd037
- 22:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P86263 and previous config saved to /var/cache/conftool/dbconfig/20251201-221302-marostegui.json
- 22:11 dzahn@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host planet1004.eqiad.wmnet
- 22:11 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host planet1004.eqiad.wmnet with OS trixie
- 22:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P86262 and previous config saved to /var/cache/conftool/dbconfig/20251201-221036-marostegui.json
- 21:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P86261 and previous config saved to /var/cache/conftool/dbconfig/20251201-215754-marostegui.json
- 21:57 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on planet1004.eqiad.wmnet with reason: host reimage
- 21:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P86260 and previous config saved to /var/cache/conftool/dbconfig/20251201-215529-marostegui.json
- 21:52 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on planet1004.eqiad.wmnet with reason: host reimage
- 21:42 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86259 and previous config saved to /var/cache/conftool/dbconfig/20251201-214247-marostegui.json
- 21:42 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host planet1004.eqiad.wmnet with OS trixie
- 21:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86258 and previous config saved to /var/cache/conftool/dbconfig/20251201-214021-marostegui.json
- 21:37 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM planet1004.eqiad.wmnet - dzahn@cumin2002"
- 21:37 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM planet1004.eqiad.wmnet - dzahn@cumin2002"
- 21:36 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) planet1004.eqiad.wmnet on all recursors
- 21:36 dzahn@cumin2002: START - Cookbook sre.dns.wipe-cache planet1004.eqiad.wmnet on all recursors
- 21:36 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 21:36 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM planet1004.eqiad.wmnet - dzahn@cumin2002"
- 21:36 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM planet1004.eqiad.wmnet - dzahn@cumin2002"
- 21:36 bvibber@deploy2002: Finished scap sync-world: Backport for StickyHeaders: fix Minerva list styling for "peeking" bullet points (T409325) (duration: 07m 08s)
- 21:32 bvibber@deploy2002: bvibber: Continuing with sync
- 21:31 eileen: civicrm upgraded from 37ddffc2 to ee12d616
- 21:31 bvibber@deploy2002: bvibber: Backport for StickyHeaders: fix Minerva list styling for "peeking" bullet points (T409325) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:29 dzahn@cumin2002: START - Cookbook sre.dns.netbox
- 21:29 dzahn@cumin2002: START - Cookbook sre.ganeti.makevm for new host planet1004.eqiad.wmnet
- 21:29 bvibber@deploy2002: Started scap sync-world: Backport for StickyHeaders: fix Minerva list styling for "peeking" bullet points (T409325)
- 21:25 cscott@deploy2002: Finished scap sync-world: Backport for Deploy Parsoid Read Views to 19 wikis (T411283), Change the README to Markdown, noc: Point links in /conf to Gitiles rather than Differential, REST: enable the site.v1 module (T409516), cirrus: Apply increased near match weight on commonswiki (T408154) (duration: 12m
- 21:21 cscott@deploy2002: cscott, ebernhardson, tgr, arlolra, bpirkle: Continuing with sync
- 21:17 cscott@deploy2002: cscott, ebernhardson, tgr, arlolra, bpirkle: Backport for Deploy Parsoid Read Views to 19 wikis (T411283), Change the README to Markdown, noc: Point links in /conf to Gitiles rather than Differential, REST: enable the site.v1 module (T409516), cirrus: Apply increased near match weight on commonswiki (T408154)
- 21:16 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
- 21:16 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
- 21:13 cscott@deploy2002: Started scap sync-world: Backport for Deploy Parsoid Read Views to 19 wikis (T411283), Change the README to Markdown, noc: Point links in /conf to Gitiles rather than Differential, REST: enable the site.v1 module (T409516), cirrus: Apply increased near match weight on commonswiki (T408154)
- 21:03 ejegg: payments-wiki upgraded from bb179e9c to c1b83aa2
- 20:57 urbanecm@deploy2002: Finished scap sync-world: Backport for Introduce HTML confirmation email (T396155), ConfirmEmailHooks: Do not run when UserEmailConfirmationUseHTML is true (T396155) (duration: 36m 09s)
- 20:51 herron: prometheus100[78] grow /dev/vg0/prometheus-k8s-dse filesystems
- 20:44 urbanecm@deploy2002: urbanecm: Continuing with sync
- 20:44 urbanecm@deploy2002: urbanecm: Backport for Introduce HTML confirmation email (T396155), ConfirmEmailHooks: Do not run when UserEmailConfirmationUseHTML is true (T396155) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:37 jhathaway@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest2001.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 20:26 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host sretest2001.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 20:20 urbanecm@deploy2002: Started scap sync-world: Backport for Introduce HTML confirmation email (T396155), ConfirmEmailHooks: Do not run when UserEmailConfirmationUseHTML is true (T396155)
- 20:13 jhathaway@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on sretest2001.codfw.wmnet with reason: T383173
- 20:10 taavi@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs-low-traffic-eqiad
- 20:09 taavi@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-low-traffic-eqiad
- 20:08 taavi@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs-secondary-eqiad
- 20:08 mutante: upgrading envoyproxy on contint1002; phab1004; T405808
- 20:04 taavi@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-secondary-eqiad
- 20:04 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2178 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86256 and previous config saved to /var/cache/conftool/dbconfig/20251201-200359-marostegui.json
- 20:03 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2178.codfw.wmnet with reason: Maintenance
- 20:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86255 and previous config saved to /var/cache/conftool/dbconfig/20251201-200335-marostegui.json
- 20:02 mutante: updating envoyproxy from 1.29.x to 1.32.x on phabricator prod host
- 19:49 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P{lvs6003*} and A:liberica
- 19:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P86254 and previous config saved to /var/cache/conftool/dbconfig/20251201-194828-marostegui.json
- 19:46 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P{lvs6003*} and A:liberica
- 19:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P86253 and previous config saved to /var/cache/conftool/dbconfig/20251201-193320-marostegui.json
- 19:28 cdobbins@cumin2002: END (FAIL) - Cookbook sre.loadbalancer.admin (exit_code=1) rebooting P{lvs6003*} and A:liberica
- 19:25 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P{lvs6003*} and A:liberica
- 19:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86252 and previous config saved to /var/cache/conftool/dbconfig/20251201-191812-marostegui.json
- 19:14 cdobbins@cumin2002: END (FAIL) - Cookbook sre.loadbalancer.admin (exit_code=1) rebooting P{lvs6003*} and A:liberica
- 19:11 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P{lvs6003*} and A:liberica
- 19:03 cdobbins@cumin2002: END (FAIL) - Cookbook sre.loadbalancer.admin (exit_code=1) rebooting P{lvs6003*} and A:liberica
- 19:00 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P{lvs6003*} and A:liberica
- 18:44 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudweb1003.wikimedia.org with OS trixie
- 18:24 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudweb1003.wikimedia.org with reason: host reimage
- 18:18 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudweb1003.wikimedia.org with reason: host reimage
- 18:05 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudweb1003.wikimedia.org with OS trixie
- 18:03 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
- 18:02 taavi@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs-secondary-eqiad
- 18:01 taavi@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-secondary-eqiad
- 18:00 taavi@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs-low-traffic-eqiad
- 17:59 taavi@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-low-traffic-eqiad
- 17:56 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
- 17:45 taavi@cumin1003: conftool action : set/pooled=no; selector: cluster=cloudweb,name=cloudweb1003.wikimedia.org
- 17:43 taavi@cumin1003: conftool action : set/pooled=inactive; selector: cluster=cloudweb,name=cloudweb1003.wikimedia.org
- 17:39 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudweb1003.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
- 17:39 bd808@deploy2002: Finished scap sync-world: Backport for labswiki: Enable sitenotice on mobile (T410702) (duration: 06m 49s)
- 17:39 tappof: "thanos-store: set cutoff days to 1" reverted on titan2001 (4/4) T410152
- 17:35 bd808@deploy2002: bd808: Continuing with sync
- 17:34 bd808@deploy2002: bd808: Backport for labswiki: Enable sitenotice on mobile (T410702) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 17:32 bd808@deploy2002: Started scap sync-world: Backport for labswiki: Enable sitenotice on mobile (T410702)
- 17:32 andrew@cumin2002: START - Cookbook sre.hosts.provision for host cloudweb1003.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
- 17:31 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudweb1004.wikimedia.org with OS trixie
- 17:17 tappof: "thanos-store: set cutoff days to 1" reverted on titan2002 (3/4) T410152
- 17:08 hnowlan@deploy2002: Finished deploy [restbase/deploy@19cb647]: Add new wikis to restbase T408352 T408344 (duration: 16m 16s)
- 16:59 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1157 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86251 and previous config saved to /var/cache/conftool/dbconfig/20251201-165902-marostegui.json
- 16:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1157.eqiad.wmnet with reason: Maintenance
- 16:58 cdobbins@cumin2002: END (FAIL) - Cookbook sre.loadbalancer.admin (exit_code=1) rebooting P{lvs6003*} and A:liberica
- 16:55 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P{lvs6003*} and A:liberica
- 16:52 hnowlan@deploy2002: Started deploy [restbase/deploy@19cb647]: Add new wikis to restbase T408352 T408344
- 16:48 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudweb1004.wikimedia.org with reason: host reimage
- 16:43 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudweb1004.wikimedia.org with reason: host reimage
- 16:31 Emperor: depool ms-fe2014 for disk swap T410959
- 16:31 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudweb1004.wikimedia.org with OS trixie
- 16:30 andrew@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudweb1004.wikimedia.org with OS trixie
- 16:29 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2158 (T410589)', diff saved to https://phabricator.wikimedia.org/P86250 and previous config saved to /var/cache/conftool/dbconfig/20251201-162923-ladsgroup.json
- 16:29 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2158.codfw.wmnet with reason: Maintenance
- 16:29 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T410589)', diff saved to https://phabricator.wikimedia.org/P86249 and previous config saved to /var/cache/conftool/dbconfig/20251201-162900-ladsgroup.json
- 16:28 tappof: "thanos-store: set cutoff days to 1" reverted on titan1002 (2/4) T410152
- 16:20 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db1187 gradually with 4 steps - After schema change
- 16:13 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P86247 and previous config saved to /var/cache/conftool/dbconfig/20251201-161352-ladsgroup.json
- 16:00 taavi@dns1004: END - running authdns-update
- 15:59 taavi@dns1004: START - running authdns-update
- 15:58 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P86245 and previous config saved to /var/cache/conftool/dbconfig/20251201-155844-ladsgroup.json
- 15:56 tappof: "thanos-store: set cutoff days to 1" reverted on titan1001 (1/4) T410152
- 15:56 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudweb1004.wikimedia.org with OS trixie
- 15:56 tappof: "thanos-store: set cutoff days to 1" reverted on titan1001 (1/4)
- 15:56 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2171 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86244 and previous config saved to /var/cache/conftool/dbconfig/20251201-155606-marostegui.json
- 15:55 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2171.codfw.wmnet with reason: Maintenance
- 15:55 andrew@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudweb1004.wikimedia.org with OS trixie
- 15:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86243 and previous config saved to /var/cache/conftool/dbconfig/20251201-155542-marostegui.json
- 15:50 inflatador: bking@wmf3062 restart wdqs codfw for high lag https://docs.google.com/spreadsheets/d/1UaabYlqj37EEaLAkrRArn4yNuNviGObgsGTfquIIHAQ/edit?gid=0#gid=0
- 15:50 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudweb1004.wikimedia.org with OS trixie
- 15:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1032.eqiad.wmnet with OS bookworm
- 15:43 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T410589)', diff saved to https://phabricator.wikimedia.org/P86241 and previous config saved to /var/cache/conftool/dbconfig/20251201-154337-ladsgroup.json
- 15:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P86240 and previous config saved to /var/cache/conftool/dbconfig/20251201-154035-marostegui.json
- 15:34 marostegui@cumin1003: START - Cookbook sre.mysql.pool db1187 gradually with 4 steps - After schema change
- 15:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P86238 and previous config saved to /var/cache/conftool/dbconfig/20251201-152527-marostegui.json
- 15:24 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1032.eqiad.wmnet with reason: host reimage
- 15:23 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic-next: apply
- 15:22 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply
- 15:19 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1032.eqiad.wmnet with reason: host reimage
- 15:19 sukhe@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp2043.codfw.wmnet with OS trixie
- 15:15 Lucas_WMDE: UTC afternoon backport+config window done
- 15:12 kharlan@deploy2002: Finished scap sync-world: Backport for EventLogging: Register mediawiki.hcaptcha.edit stream (T406865), Set new $wgRateLimits config for edit attempt log (T406865) (duration: 11m 03s)
- 15:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86237 and previous config saved to /var/cache/conftool/dbconfig/20251201-151019-marostegui.json
- 15:07 kharlan@deploy2002: kharlan, sguebo: Continuing with sync
- 15:03 kharlan@deploy2002: kharlan, sguebo: Backport for EventLogging: Register mediawiki.hcaptcha.edit stream (T406865), Set new $wgRateLimits config for edit attempt log (T406865) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 15:03 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudweb1004.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
- 15:01 kharlan@deploy2002: Started scap sync-world: Backport for EventLogging: Register mediawiki.hcaptcha.edit stream (T406865), Set new $wgRateLimits config for edit attempt log (T406865)
- 14:59 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1032.eqiad.wmnet with OS bookworm
- 14:55 andrew@cumin2002: START - Cookbook sre.hosts.provision for host cloudweb1004.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
- 14:54 esanders@deploy2002: Finished scap sync-world: Backport for FlowMoveBoardsToSubpages: Add 'title' option for moving a specific board (T402552) (duration: 06m 31s)
- 14:50 esanders@deploy2002: esanders: Continuing with sync
- 14:49 esanders@deploy2002: esanders: Backport for FlowMoveBoardsToSubpages: Add 'title' option for moving a specific board (T402552) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:47 esanders@deploy2002: Started scap sync-world: Backport for FlowMoveBoardsToSubpages: Add 'title' option for moving a specific board (T402552)
- 14:46 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for CentralAuthUser: Cache getLocalGroups() (T410878) (duration: 14m 51s)
- 14:42 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, matmarex: Continuing with sync
- 14:37 slyngshede@dns1004: END - running authdns-update
- 14:36 slyngshede@dns1004: START - running authdns-update
- 14:33 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, matmarex: Backport for CentralAuthUser: Cache getLocalGroups() (T410878) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:31 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for CentralAuthUser: Cache getLocalGroups() (T410878)
- 14:30 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Api: Initialise reference variable (T411075) (duration: 07m 04s)
- 14:28 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host cp2043.codfw.wmnet with OS trixie
- 14:26 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, matmarex: Continuing with sync
- 14:25 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, matmarex: Backport for Api: Initialise reference variable (T411075) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:23 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Api: Initialise reference variable (T411075)
- 14:17 mfossati@deploy2002: Finished scap sync-world: Backport for ReaderExperiments' StickyHeaders stream configuration (T410533) (duration: 11m 51s)
- 14:11 mfossati@deploy2002: mfossati: Continuing with sync
- 14:09 mfossati@deploy2002: mfossati: Backport for ReaderExperiments' StickyHeaders stream configuration (T410533) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:05 mfossati@deploy2002: Started scap sync-world: Backport for ReaderExperiments' StickyHeaders stream configuration (T410533)
- 13:43 dcausse: T408431: reindexing all wikis in codfw
- 13:42 moritzm: upgrade Envoy on deployment servers T405808
- 13:16 moritzm: imported rancid 3.13-2+wmf12u1 for bookworm-wikimedia and 3.14-1+wmf13u1 for trixie-wikimedia T410606
- 12:58 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.powercycle (exit_code=0) for host ml-serve1013
- 12:53 elukey@cumin2002: START - Cookbook sre.hosts.powercycle for host ml-serve1013
- 12:47 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve1013.eqiad.wmnet with OS trixie
- 11:54 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1150.eqiad.wmnet with reason: Maintenance
- 11:53 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-serve1013.eqiad.wmnet with reason: host reimage
- 11:49 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2157 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86236 and previous config saved to /var/cache/conftool/dbconfig/20251201-114902-marostegui.json
- 11:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2157.codfw.wmnet with reason: Maintenance
- 11:47 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve1013.eqiad.wmnet with reason: host reimage
- 11:36 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host ml-serve1013.eqiad.wmnet with OS trixie
- 11:29 btullis: restarting envoyproxy process on cephosd100[1-5] for T405808
- 11:28 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ml-serve1013.eqiad.wmnet with OS trixie
- 11:09 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1010.eqiad.wmnet
- 11:03 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1010.eqiad.wmnet
- 11:02 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host ml-serve1013.eqiad.wmnet with OS trixie
- 10:52 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.powercycle (exit_code=0) for host ml-serve1013
- 10:51 JavierMonton: Deployed refinery using scap, then deployed onto hdfs
- 10:47 moritzm: upgrade Envoy on matomo1001 T405808
- 10:47 elukey@cumin2002: START - Cookbook sre.hosts.powercycle for host ml-serve1013
- 10:46 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 10:46 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
- 10:42 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ml-serve1013.eqiad.wmnet with OS trixie
- 10:40 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host ml-serve1013.eqiad.wmnet with OS trixie
- 10:39 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ml-serve1013.eqiad.wmnet with OS trixie
- 10:23 javiermonton@deploy2002: Finished deploy [analytics/refinery@fa63f82]: Regular analytics train [analytics/refinery@fa63f82e] (duration: 00m 28s)
- 10:23 javiermonton@deploy2002: Started deploy [analytics/refinery@fa63f82]: Regular analytics train [analytics/refinery@fa63f82e]
- 10:20 a-pizzata@deploy2002: Finished deploy [analytics/refinery@fa63f82]: Regular analytics train [analytics/refinery@fa63f82e] (duration: 02m 54s)
- 10:17 a-pizzata@deploy2002: Started deploy [analytics/refinery@fa63f82]: Regular analytics train [analytics/refinery@fa63f82e]
- 10:16 a-pizzata@deploy2002: Finished deploy [analytics/refinery@fa63f82] (hadoop-test): Analytics train TEST [analytics/refinery@fa63f82e] (duration: 01m 08s)
- 10:15 a-pizzata@deploy2002: Started deploy [analytics/refinery@fa63f82] (hadoop-test): Analytics train TEST [analytics/refinery@fa63f82e]
- 10:14 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host ml-serve1013.eqiad.wmnet with OS trixie
- 10:13 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-main: apply
- 10:13 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-main: apply
- 10:12 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-main: apply
- 10:11 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-main: apply
- 10:11 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 10:11 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Change ml-serve1013 vlan - ayounsi@cumin1003"
- 10:11 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Change ml-serve1013 vlan - ayounsi@cumin1003"
- 10:04 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
- 10:00 jmm@cumin2002: END (PASS) - Cookbook sre.o11y.roll-restart-reboot-logstash-collectors (exit_code=0) rolling restart_daemons on A:logstash-collector
- 09:53 taavi@dns1004: END - running authdns-update
- 09:53 jmm@cumin2002: START - Cookbook sre.o11y.roll-restart-reboot-logstash-collectors rolling restart_daemons on A:logstash-collector
- 09:52 taavi@dns1004: START - running authdns-update
- 09:39 moritzm: installing expat security updates
- 09:15 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 09:14 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 08:58 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2151 (T410589)', diff saved to https://phabricator.wikimedia.org/P86235 and previous config saved to /var/cache/conftool/dbconfig/20251201-085828-ladsgroup.json
- 08:58 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2151.codfw.wmnet with reason: Maintenance
- 08:50 moritzm: upgrade Envoy on config-master* T405808
- 08:33 mszwarc@deploy2002: Finished scap sync-world: Backport for Fix mw-userlink class being added too broadly (T392775) (duration: 38m 35s)
- 08:29 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 08:29 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 08:19 mszwarc@deploy2002: mszwarc: Continuing with sync
- 08:19 brouberol@dns1004: END - running authdns-update
- 08:18 mszwarc@deploy2002: mszwarc: Backport for Fix mw-userlink class being added too broadly (T392775) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 08:18 brouberol@dns1004: START - running authdns-update
- 07:55 mszwarc@deploy2002: Started scap sync-world: Backport for Fix mw-userlink class being added too broadly (T392775)
- 06:47 eileen: civicrm upgraded from 1fc76c13 to 37ddffc2
- 06:45 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
- 05:42 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2212.codfw.wmnet with reason: Maintenance
- 05:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1163.eqiad.wmnet with reason: Maintenance
- 04:53 eileen: civicrm upgraded from 6c200f91 to 1fc76c13
- 03:39 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1245.eqiad.wmnet with reason: Maintenance
- 03:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1230 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86234 and previous config saved to /var/cache/conftool/dbconfig/20251201-033910-marostegui.json
- 03:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1230', diff saved to https://phabricator.wikimedia.org/P86233 and previous config saved to /var/cache/conftool/dbconfig/20251201-032402-marostegui.json
- 03:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1230', diff saved to https://phabricator.wikimedia.org/P86232 and previous config saved to /var/cache/conftool/dbconfig/20251201-030855-marostegui.json
- 02:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1230 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86231 and previous config saved to /var/cache/conftool/dbconfig/20251201-025347-marostegui.json
- 01:27 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1230 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P86230 and previous config saved to /var/cache/conftool/dbconfig/20251201-012716-marostegui.json
- 01:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1230.eqiad.wmnet with reason: Maintenance
- 01:13 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 12m 34s)
- 01:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
- 00:22 eileen: civicrm upgraded from 4437a5ef to 6c200f91
2000s
- Archive 1: 2004 Jun - 2004 Sep
- Archive 2: 2004 Oct - 2004 Nov
- Archive 3: 2004 Dec - 2005 Mar
- Archive 4: 2005 Apr - 2005 Jul
- Archive 5: 2005 Aug - 2005 Oct, with revision history 2004-06-23 to 2005-11-25
- Archive 6: 2005 Nov - 2006 Feb
- Archive 7: 2006 Mar - 2006 Jun
- Archive 8: 2006 Jul - 2006 Sep
- Archive 9: 2006 Oct - 2007 Jan, with revision history 2005-11-25 to 2007-02-21
- Archive 10: 2007 Feb - 2007 Jun
- Archive 11: 2007 Jul - 2007 Dec
- Archive 12: 2008 Jan - 2008 Jul
- Archive 12a: 2008 Aug
- Archive 12b: 2008 Sept
- Archive 13: 2008 Oct - 2009 Jun
- Archive 14: 2009 Jun - 2009 Dec
2010s
- Archive 15: 2010 Jan - 2010 Jun
- Archive 16: 2010 Jul - 2010 Oct
- Archive 17: 2010 Nov - 2010 Dec
- Archive 18: 2011 Jan - 2011 Jun
- Archive 19: 2011 Jul - 2011 Dec
- Archive 20: 2011 Dec - 2012 Jun, with revision history 2007-02-21 to 2012-03-27
- Archive 21: 2012 Jul - 2013 Jan
- Archive 22: 2013 Jan - 2013 Jul
- Archive 23: 2013 Aug - 2013 Dec
- Archive 24: 2014 Jan - 2014 Mar
- Archive 25: 2014 April - 2014 September
- Archive 26: 2014 October - 2014 December
- Archive 27: 2015 January - 2015 July
- Archive 28: 2015 August - 2015 December
- Archive 29: 2016 January - 2016 May
- Archive 30: 2016 June - 2016 August
- Archive 31: 2016 September - 2016 December
- Archive 32: 2017 January - 2017 July
- Archive 33: 2017 August - 2017 December
- Archive 34: 2018 January - 2018 April
- Archive 35: 2018 May - 2018 August
- Archive 36: 2018 September - 2018 December
- Archive 37: 2019 January - 2019 April
- Archive 38: 2019 May - 2019 August
- Archive 39: 2019 September - 2019 December
2020-2024
- Archive 40: 2020 January - 2020 April
- Archive 41: 2020 May - 2020 July
- Archive 42: 2020 August - 2020 November
- Archive 43: 2020 December
- Archive 44: 2021 January - 2021 April
- Archive 45: 2021 May - 2021 July
- Archive 46: 2021 August - 2021 October
- Archive 47: 2021 November - 2021 December
- Archive 48: 2022 January
- Archive 49: 2022 February
- Archive 50: 2022 March
- Archive 51: 2022 April 1-15
- Archive 52: 2022 April 16-30
- Archive 53: 2022 May
- Archive 54: 2022 June
- Archive 55: 2022 July
- Archive 56: 2022 August
- Archive 57: 2022 September
- Archive 58: 2022 October
- Archive 59: 2022 November 1-15
- Archive 60: 2022 November 16-30
- Archive 61: 2022 December
- Archive 62: 2023 January
- Archive 63: 2023 February
- Archive 64: 2023 March
- Archive 65: 2023 April
- Archive 66: 2023 May
- Archive 67: 2023 June
- Archive 68: 2023 July
- Archive 69: 2023 August 1-15
- Archive 70: 2023 August 16-31
- Archive 71: 2023 September
- Archive 72: 2023 October
- Archive 73: 2023 November
- Archive 74: 2023 December
- Archive 75: 2024 January
- Archive 76: 2024 February
- Archive 77: 2024 March
- Archive 78: 2024 April
- Archive 79: 2024 May 1-15
- Archive 80: 2024 May 16-31
- Archive 81: 2024 June 1-15
- Archive 82: 2024 June 16-30
- Archive 83: 2024 July
- Archive 84: 2024 August
- Archive 85: 2024 September
- Archive 86: 2024 October
- Archive 87: 2024 November
- Archive 88: 2024 December