Server Admin Log

From Wikitech

2025-02-13

  • 16:29 bking@cumin2002: START - Cookbook sre.dns.netbox
  • 16:29 bking@cumin2002: START - Cookbook sre.hosts.rename from elastic1106 to relforge1007
  • 16:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from elastic1105 to relforge1006
  • 16:28 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host relforge1006
  • 16:27 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host relforge1006
  • 16:27 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:27 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming elastic1105 to relforge1006 - bking@cumin2002"
  • 16:27 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming elastic1105 to relforge1006 - bking@cumin2002"
  • 16:23 cgoubert@deploy2002: Started deploy [restbase/deploy@511b3a4]: Add kncwiki (T385186)
  • 16:22 bking@cumin2002: START - Cookbook sre.dns.netbox
  • 16:22 bking@cumin2002: START - Cookbook sre.hosts.rename from elastic1105 to relforge1006
  • 16:21 bking@cumin2002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from elastic1104 to relforge1005
  • 16:20 stevemunene@cumin1002: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1007.eqiad.wmnet with OS bookworm
  • 16:20 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host relforge1005
  • 16:20 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host relforge1005
  • 16:20 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:20 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming elastic1104 to relforge1005 - bking@cumin2002"
  • 16:19 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming elastic1104 to relforge1005 - bking@cumin2002"
  • 16:16 bking@cumin2002: START - Cookbook sre.dns.netbox
  • 16:15 bking@cumin2002: START - Cookbook sre.hosts.rename from elastic1104 to relforge1005
  • 16:08 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 16:08 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 15:40 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2243.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 15:35 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2243.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 15:29 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Add config option to make somevalue hashes use URI (T384344), Make somevalue hashes use URI in tests (T384344), Add config option to fix s:, ref:, v: namespace prefix (T384344), Fix s:, ref:, v: namespace prefix in tests (T384344) (duration: 11m 19s)
  • 15:28 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2243.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 15:23 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2243.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 15:22 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde: Continuing with sync
  • 15:20 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde: Backport for Add config option to make somevalue hashes use URI (T384344), Make somevalue hashes use URI in tests (T384344), Add config option to fix s:, ref:, v: namespace prefix (T384344), Fix s:, ref:, v: namespace prefix in tests (T384344) synced to the testservers (https://wikitech.
  • 15:19 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2146.codfw.wmnet with reason: maintenance
  • 15:19 bking@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on elastic[1104-1106].eqiad.wmnet with reason: T386357
  • 15:18 marostegui@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2146.codfw.wmnet
  • 15:18 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Add config option to make somevalue hashes use URI (T384344), Make somevalue hashes use URI in tests (T384344), Add config option to fix s:, ref:, v: namespace prefix (T384344), Fix s:, ref:, v: namespace prefix in tests (T384344)
  • 15:17 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2243.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 15:15 stevemunene@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1006.eqiad.wmnet with OS bookworm
  • 15:15 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1219.eqiad.wmnet with reason: Index rebuild
  • 15:13 marostegui@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db1219.eqiad.wmnet
  • 15:12 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2243.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 15:12 marostegui@cumin1002: START - Cookbook sre.mysql.upgrade for db2146.codfw.wmnet
  • 15:11 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2146', diff saved to https://phabricator.wikimedia.org/P73463 and previous config saved to /var/cache/conftool/dbconfig/20250213-151117-marostegui.json
  • 15:10 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2243.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 15:07 marostegui@cumin1002: START - Cookbook sre.mysql.upgrade for db1219.eqiad.wmnet
  • 15:07 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1219', diff saved to https://phabricator.wikimedia.org/P73462 and previous config saved to /var/cache/conftool/dbconfig/20250213-150715-marostegui.json
  • 15:05 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2243.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 15:03 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Banning hosts: elastic1104*,elastic1105*,elastic1106* for ban hosts prior to reimage/repurpose - bking@cumin2002 - T386357
  • 15:03 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Banning hosts: elastic1104*,elastic1105*,elastic1106* for ban hosts prior to reimage/repurpose - bking@cumin2002 - T386357
  • 15:03 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host puppetserver2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 15:03 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host puppetserver2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 14:57 stevemunene@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1006.eqiad.wmnet with reason: host reimage
  • 14:53 stevemunene@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1006.eqiad.wmnet with reason: host reimage
  • 14:35 stevemunene@cumin1002: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1006.eqiad.wmnet with OS bookworm
  • 14:16 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Banning hosts: elastic1104*,elastic1005*,elastic1006* for ban hosts prior to reimage/repurpose - bking@cumin2002 - T386357
  • 14:16 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Banning hosts: elastic1104*,elastic1005*,elastic1006* for ban hosts prior to reimage/repurpose - bking@cumin2002 - T386357
  • 14:06 stevemunene@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1005.eqiad.wmnet with OS bookworm
  • 14:00 cdanis@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 13:59 cdanis@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 13:48 stevemunene@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1005.eqiad.wmnet with reason: host reimage
  • 13:48 cdanis@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 13:46 cdanis@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 13:45 stevemunene@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1005.eqiad.wmnet with reason: host reimage
  • 13:27 stevemunene@cumin1002: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1005.eqiad.wmnet with OS bookworm
  • 13:04 kart_: Updated Cxserver to 2025-02-13-102531-production (T381943, T386231)
  • 13:03 kartik@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
  • 13:02 stevemunene@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1004.eqiad.wmnet with OS bookworm
  • 13:02 kartik@deploy2002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
  • 13:01 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
  • 13:01 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/cxserver: apply
  • 12:56 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
  • 12:56 kartik@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
  • 12:45 stevemunene@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1004.eqiad.wmnet with reason: host reimage
  • 12:41 stevemunene@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1004.eqiad.wmnet with reason: host reimage
  • 12:25 stevemunene@cumin1002: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1004.eqiad.wmnet with OS bookworm
  • 12:19 ladsgroup@deploy2002: Finished deploy [dumps/dumps@2e0a7a5]: Stop producing Yahoo! abstract dumps (T382069) (duration: 00m 07s)
  • 12:19 ladsgroup@deploy2002: Started deploy [dumps/dumps@2e0a7a5]: Stop producing Yahoo! abstract dumps (T382069)
  • 12:18 stevemunene: draining dse-k8s-worker1004 ready for reimage to bookworm and containerd for T377875
  • 12:04 fnegri@cumin1002: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) for database kncwiki (T385188)
  • 11:38 fnegri@cumin1002: START - Cookbook sre.wikireplicas.add-wiki for database kncwiki (T385188)
  • 11:08 stevemunene@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1003.eqiad.wmnet with OS bookworm
  • 10:50 stevemunene@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1003.eqiad.wmnet with reason: host reimage
  • 10:49 aklapper@deploy2002: Finished scap sync-world: Backport for ApiPageTriageList: Check that $user is defined before using it (T386332) (duration: 10m 47s)
  • 10:48 joal@deploy2002: Finished deploy [analytics/refinery@08b2bd2] (hadoop-test): Analytics one-off deploy - TEST [analytics/refinery@08b2bd2e] (duration: 00m 44s)
  • 10:47 joal@deploy2002: Started deploy [analytics/refinery@08b2bd2] (hadoop-test): Analytics one-off deploy - TEST [analytics/refinery@08b2bd2e]
  • 10:47 stevemunene@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1003.eqiad.wmnet with reason: host reimage
  • 10:47 joal@deploy2002: Finished deploy [analytics/refinery@08b2bd2] (thin): Analytics one-off deploy -THIN [analytics/refinery@08b2bd2e] (duration: 00m 46s)
  • 10:46 joal@deploy2002: Started deploy [analytics/refinery@08b2bd2] (thin): Analytics one-off deploy -THIN [analytics/refinery@08b2bd2e]
  • 10:46 joal@deploy2002: Finished deploy [analytics/refinery@08b2bd2]: Analytics one-off deploy [analytics/refinery@08b2bd2e] (duration: 02m 07s)
  • 10:44 joal@deploy2002: Started deploy [analytics/refinery@08b2bd2]: Analytics one-off deploy [analytics/refinery@08b2bd2e]
  • 10:42 aklapper@deploy2002: kharlan, aklapper: Continuing with sync
  • 10:41 aklapper@deploy2002: kharlan, aklapper: Backport for ApiPageTriageList: Check that $user is defined before using it (T386332) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 10:38 aklapper@deploy2002: Started scap sync-world: Backport for ApiPageTriageList: Check that $user is defined before using it (T386332)
  • 10:14 stevemunene@cumin1002: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1003.eqiad.wmnet with OS bookworm
  • 09:40 aklapper@deploy2002: rebuilt and synchronized wikiversions files: group2 to 1.44.0-wmf.16 refs T382367
  • 09:03 dcausse: closing UTC morning backport window
  • 09:01 dcausse@deploy2002: Finished scap sync-world: Backport for cirrus: enable mlr-2025 for select wikis (T385972) (duration: 19m 06s)
  • 08:54 dcausse@deploy2002: dcausse, gmodena: Continuing with sync
  • 08:45 dcausse@deploy2002: dcausse, gmodena: Backport for cirrus: enable mlr-2025 for select wikis (T385972) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:42 dcausse@deploy2002: Started scap sync-world: Backport for cirrus: enable mlr-2025 for select wikis (T385972)
  • 08:36 dcausse@deploy2002: Finished scap sync-world: Backport for Lift IP cap for edit-a-thon on 2025-02-17 & 2025-03-10 (T386126) (duration: 09m 45s)
  • 08:27 dcausse@deploy2002: Started scap sync-world: Backport for Lift IP cap for edit-a-thon on 2025-02-17 & 2025-03-10 (T386126)
  • 08:23 dcausse@deploy2002: Finished scap sync-world: Backport for Revert "zhwiki: Add 2025 CNY celebration logos" (duration: 13m 24s)
  • 08:17 dcausse@deploy2002: stang, dcausse: Continuing with sync
  • 08:13 dcausse@deploy2002: stang, dcausse: Backport for Revert "zhwiki: Add 2025 CNY celebration logos" synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:10 dcausse@deploy2002: Started scap sync-world: Backport for Revert "zhwiki: Add 2025 CNY celebration logos"
  • 04:45 kart_: Updated cxserver to 2025-02-12-075258-production (T381943)
  • 04:41 kartik@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
  • 04:40 kartik@deploy2002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
  • 04:38 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
  • 04:37 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/cxserver: apply
  • 04:28 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
  • 04:28 kartik@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
  • 03:40 tchin@deploy2002: Finished deploy [airflow-dags/analytics@aaba3ff]: Deploying airflow for T306896 (duration: 01m 07s)
  • 03:39 tchin@deploy2002: Started deploy [airflow-dags/analytics@aaba3ff]: Deploying airflow for T306896
  • 03:36 eileen: civicrm upgraded from c52e87d6 to a62ed046
  • 01:48 zabe: zabe@deploy2002:~$ mwscript-k8s --comment="T386292" -f -- extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=metawiki --logwiki=metawiki 'Nebuls' 'Renamed user 9b7b870ac2b7d3f071232203ec1030d1'
  • 01:48 zabe: zabe@deploy2002:~$ mwscript-k8s --comment="T386292" -f -- extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=loginwiki --logwiki=metawiki 'Sofia Baldelli' 'AnonymWikiuser 245'
  • 01:35 zabe: zabe@mwmaint2002:/tmp/uploads$ mwscript importImages.php --wiki=commonswiki --comment-ext=txt --user=OGPawlis --overwrite /tmp/uploads # T382976
  • 00:15 zabe@deploy2002: Finished scap sync-world: Backport for Reduce revision-slots cache expiry to 60s on diqwiki and ttwiki (T183490) (duration: 10m 39s)
  • 00:08 zabe@deploy2002: zabe: Continuing with sync
  • 00:07 zabe@deploy2002: zabe: Backport for Reduce revision-slots cache expiry to 60s on diqwiki and ttwiki (T183490) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 00:04 zabe@deploy2002: Started scap sync-world: Backport for Reduce revision-slots cache expiry to 60s on diqwiki and ttwiki (T183490)
  • 00:03 eileen: civicrm upgraded from 454e0ccd to c52e87d6
  • 00:00 toyofuku@deploy2002: Finished scap sync-world: Backport for Lazy Load Images (T366402), Lazy Load Images (T366402) (duration: 31m 40s)

2025-02-12

  • 23:51 toyofuku@deploy2002: toyofuku: Continuing with sync
  • 23:34 toyofuku@deploy2002: toyofuku: Backport for Lazy Load Images (T366402), Lazy Load Images (T366402) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 23:28 toyofuku@deploy2002: Started scap sync-world: Backport for Lazy Load Images (T366402), Lazy Load Images (T366402)
  • 22:23 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.dhcp (exit_code=99) for host relforge1004.eqiad.wmnet
  • 22:07 apine@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 22:06 apine@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 22:06 apine@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 22:05 apine@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 22:04 apine@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 22:04 apine@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 22:02 apine@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 21:31 kamila@cumin1002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1124-1128].eqiad.wmnet
  • 21:31 kamila@cumin1002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1124-1128].eqiad.wmnet
  • 21:14 cjming@deploy2002: Finished scap sync-world: Backport for [arwiki] Set noindex for namespace user talk (T371470) (duration: 11m 05s)
  • 21:07 cjming@deploy2002: cjming, gergesshamon: Continuing with sync
  • 21:06 cjming@deploy2002: cjming, gergesshamon: Backport for [arwiki] Set noindex for namespace user talk (T371470) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:03 cjming@deploy2002: Started scap sync-world: Backport for [arwiki] Set noindex for namespace user talk (T371470)
  • 20:30 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2243.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 20:30 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2243.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 20:14 marostegui@cumin1002: dbctl commit (dc=all): 'db2176 (re)pooling @ 100%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73456 and previous config saved to /var/cache/conftool/dbconfig/20250212-201424-root.json
  • 20:01 bking@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on relforge1004.eqiad.wmnet with reason: T380752
  • 19:59 marostegui@cumin1002: dbctl commit (dc=all): 'db2176 (re)pooling @ 75%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73455 and previous config saved to /var/cache/conftool/dbconfig/20250212-195919-root.json
  • 19:44 marostegui@cumin1002: dbctl commit (dc=all): 'db2176 (re)pooling @ 50%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73454 and previous config saved to /var/cache/conftool/dbconfig/20250212-194414-root.json
  • 19:29 marostegui@cumin1002: dbctl commit (dc=all): 'db2176 (re)pooling @ 25%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73453 and previous config saved to /var/cache/conftool/dbconfig/20250212-192909-root.json
  • 19:27 marostegui@cumin1002: dbctl commit (dc=all): 'db1232 (re)pooling @ 100%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73452 and previous config saved to /var/cache/conftool/dbconfig/20250212-192700-root.json
  • 19:14 marostegui@cumin1002: dbctl commit (dc=all): 'db2176 (re)pooling @ 10%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73451 and previous config saved to /var/cache/conftool/dbconfig/20250212-191404-root.json
  • 19:11 marostegui@cumin1002: dbctl commit (dc=all): 'db1232 (re)pooling @ 75%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73450 and previous config saved to /var/cache/conftool/dbconfig/20250212-191155-root.json
  • 18:56 marostegui@cumin1002: dbctl commit (dc=all): 'db1232 (re)pooling @ 50%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73449 and previous config saved to /var/cache/conftool/dbconfig/20250212-185649-root.json
  • 18:53 tchin@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventstreams-internal: apply
  • 18:52 tchin@deploy2002: helmfile [codfw] START helmfile.d/services/eventstreams-internal: apply
  • 18:51 tchin@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventstreams-internal: apply
  • 18:51 tchin@deploy2002: helmfile [eqiad] START helmfile.d/services/eventstreams-internal: apply
  • 18:50 tchin@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventstreams-internal: apply
  • 18:50 tchin@deploy2002: helmfile [eqiad] START helmfile.d/services/eventstreams-internal: apply
  • 18:47 tchin@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventstreams-internal: apply
  • 18:47 tchin@deploy2002: helmfile [eqiad] START helmfile.d/services/eventstreams-internal: apply
  • 18:47 tchin@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventstreams-internal: apply
  • 18:46 tchin@deploy2002: helmfile [eqiad] START helmfile.d/services/eventstreams-internal: apply
  • 18:45 tchin@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventstreams-internal: apply
  • 18:45 tchin@deploy2002: helmfile [eqiad] START helmfile.d/services/eventstreams-internal: apply
  • 18:44 tchin@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventstreams-internal: apply
  • 18:44 tchin@deploy2002: helmfile [eqiad] START helmfile.d/services/eventstreams-internal: apply
  • 18:41 marostegui@cumin1002: dbctl commit (dc=all): 'db1232 (re)pooling @ 25%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73448 and previous config saved to /var/cache/conftool/dbconfig/20250212-184143-root.json
  • 18:35 bking@cumin2002: START - Cookbook sre.hosts.dhcp for host relforge1004.eqiad.wmnet
  • 18:26 marostegui@cumin1002: dbctl commit (dc=all): 'db1232 (re)pooling @ 10%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73447 and previous config saved to /var/cache/conftool/dbconfig/20250212-182637-root.json
  • 17:20 stevemunene@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1002.eqiad.wmnet with OS bookworm
  • 17:14 tchin@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventstreams-internal: apply
  • 17:14 tchin@deploy2002: helmfile [codfw] START helmfile.d/services/eventstreams-internal: apply
  • 17:13 tchin@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventstreams-internal: apply
  • 17:12 tchin@deploy2002: helmfile [eqiad] START helmfile.d/services/eventstreams-internal: apply
  • 17:10 marostegui: Install 10.6.21 on db2230 T385678
  • 17:08 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2230.codfw.wmnet with reason: maintenance
  • 17:02 stevemunene@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1002.eqiad.wmnet with reason: host reimage
  • 16:59 stevemunene@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1002.eqiad.wmnet with reason: host reimage
  • 16:43 stevemunene@cumin1002: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1002.eqiad.wmnet with OS bookworm
  • 16:34 stevemunene@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1002.eqiad.wmnet with OS bookworm
  • 16:32 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 16:31 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 16:31 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 16:30 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 16:29 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 16:29 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 16:26 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 16:26 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 16:15 claime: Halving mw-api-int staging replicas to free pod ip blocks - T386107
  • 16:14 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/mw-api-int: apply
  • 16:14 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/mw-api-int: apply
  • 16:09 claime: Deleting benthos, changeprop, changeprop-jobqueue from staging to free pod ip blocks - T386107
  • 16:07 cmooney@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on cr2-magru with reason: IBGP instability from cr1 to cr2 in magru causing ping failures from alert1002
  • 15:37 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2176.codfw.wmnet with reason: Index rebuild
  • 15:36 marostegui@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2176.codfw.wmnet
  • 15:32 stevemunene@cumin1002: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1002.eqiad.wmnet with OS bookworm
  • 15:31 stevemunene@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1002.eqiad.wmnet with OS bookworm
  • 15:30 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2176.codfw.wmnet with reason: maintenance
  • 15:29 marostegui@cumin1002: START - Cookbook sre.mysql.upgrade for db2176.codfw.wmnet
  • 15:27 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2176 T385561', diff saved to https://phabricator.wikimedia.org/P73446 and previous config saved to /var/cache/conftool/dbconfig/20250212-152738-marostegui.json
  • 15:25 apine@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 15:23 root@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1232.eqiad.wmnet with reason: Index rebuild
  • 15:22 root@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db1232.eqiad.wmnet
  • 15:18 Lucas_WMDE: UTC backport+config window done
  • 15:18 stevemunene@cumin1002: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1002.eqiad.wmnet with OS bookworm
  • 15:17 otto@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-analytics: apply
  • 15:16 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Let sysops add/remove the event-organizer group by default (T376822) (duration: 12m 53s)
  • 15:16 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1232.eqiad.wmnet with reason: maintenance
  • 15:15 root@cumin1002: START - Cookbook sre.mysql.upgrade for db1232.eqiad.wmnet
  • 15:15 apine@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 15:15 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1232', diff saved to https://phabricator.wikimedia.org/P73445 and previous config saved to /var/cache/conftool/dbconfig/20250212-151533-marostegui.json
  • 15:14 apine@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 15:09 lucaswerkmeister-wmde@deploy2002: daimona, lucaswerkmeister-wmde: Continuing with sync
  • 15:07 lucaswerkmeister-wmde@deploy2002: daimona, lucaswerkmeister-wmde: Backport for Let sysops add/remove the event-organizer group by default (T376822) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 15:04 apine@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 15:04 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Let sysops add/remove the event-organizer group by default (T376822)
  • 14:59 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2179.codfw.wmnet with reason: Maintenance
  • 14:48 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for refactor(AddLink): Make eval steps more legible, feat(AddLink): store null if there is no recommendation (T382270) (duration: 11m 47s)
  • 14:41 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, migr: Continuing with sync
  • 14:39 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, migr: Backport for refactor(AddLink): Make eval steps more legible, feat(AddLink): store null if there is no recommendation (T382270) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:36 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for refactor(AddLink): Make eval steps more legible, feat(AddLink): store null if there is no recommendation (T382270)
  • 14:29 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for viwiki: Restrict the "changetags" permission to the sysop and bot groups (T385960), beta: fix typo in GEApiQueryGrowthTasksLookaheadSize variable (duration: 10m 59s)
  • 14:22 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, dragoniez, sgimeno: Continuing with sync
  • 14:21 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, dragoniez, sgimeno: Backport for viwiki: Restrict the "changetags" permission to the sysop and bot groups (T385960), beta: fix typo in GEApiQueryGrowthTasksLookaheadSize variable synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:18 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for viwiki: Restrict the "changetags" permission to the sysop and bot groups (T385960), beta: fix typo in GEApiQueryGrowthTasksLookaheadSize variable
  • 14:17 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Enable fixed Wikibase RDF on Beta (T384344) (duration: 10m 35s)
  • 14:10 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde: Continuing with sync
  • 14:09 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde: Backport for Enable fixed Wikibase RDF on Beta (T384344) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:06 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Enable fixed Wikibase RDF on Beta (T384344)
  • 13:49 stevemunene@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1001.eqiad.wmnet with OS bookworm
  • 13:40 aklapper@deploy2002: rebuilt and synchronized wikiversions files: group1 to 1.44.0-wmf.16 refs T382367
  • 13:31 stevemunene@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1001.eqiad.wmnet with reason: host reimage
  • 13:25 stevemunene@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1001.eqiad.wmnet with reason: host reimage
  • 13:19 tchin@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventstreams-internal: apply
  • 13:18 tchin@deploy2002: helmfile [codfw] START helmfile.d/services/eventstreams-internal: apply
  • 13:17 tchin@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventstreams-internal: apply
  • 13:16 tchin@deploy2002: helmfile [eqiad] START helmfile.d/services/eventstreams-internal: apply
  • 13:14 tchin@deploy2002: helmfile [staging] DONE helmfile.d/services/eventstreams-internal: apply
  • 13:13 tchin@deploy2002: helmfile [staging] START helmfile.d/services/eventstreams-internal: apply
  • 13:09 mszabo@deploy2002: Finished scap sync-world: Backport for Use original connection handle in onTransactionPreCommitOrIdle() (T386171) (duration: 11m 27s)
  • 13:09 stevemunene@cumin1002: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1001.eqiad.wmnet with OS bookworm
  • 13:08 tchin@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventstreams-internal: apply
  • 13:08 tchin@deploy2002: helmfile [codfw] START helmfile.d/services/eventstreams-internal: apply
  • 13:07 stevemunene@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1001.eqiad.wmnet with OS bookworm
  • 13:06 tchin@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventstreams-internal: apply
  • 13:06 tchin@deploy2002: helmfile [eqiad] START helmfile.d/services/eventstreams-internal: apply
  • 13:04 tchin@deploy2002: helmfile [staging] DONE helmfile.d/services/eventstreams-internal: apply
  • 13:03 mszabo@deploy2002: mszabo: Continuing with sync
  • 13:02 tchin@deploy2002: helmfile [staging] START helmfile.d/services/eventstreams-internal: apply
  • 13:01 mszabo@deploy2002: mszabo: Backport for Use original connection handle in onTransactionPreCommitOrIdle() (T386171) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 12:58 mszabo@deploy2002: Started scap sync-world: Backport for Use original connection handle in onTransactionPreCommitOrIdle() (T386171)
  • 12:40 stevemunene@cumin1002: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1001.eqiad.wmnet with OS bookworm
  • 12:27 mvolz@deploy2002: helmfile [eqiad] DONE helmfile.d/services/zotero: apply
  • 12:27 mvolz@deploy2002: helmfile [eqiad] START helmfile.d/services/zotero: apply
  • 12:26 mvolz@deploy2002: helmfile [codfw] DONE helmfile.d/services/zotero: apply
  • 12:25 mvolz@deploy2002: helmfile [codfw] START helmfile.d/services/zotero: apply
  • 12:23 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/zotero: apply
  • 12:22 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/zotero: apply
  • 12:14 mvolz@deploy2002: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
  • 12:14 mvolz@deploy2002: helmfile [eqiad] START helmfile.d/services/citoid: apply
  • 12:13 mvolz@deploy2002: helmfile [codfw] DONE helmfile.d/services/citoid: apply
  • 12:13 mvolz@deploy2002: helmfile [codfw] START helmfile.d/services/citoid: apply
  • 12:09 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/citoid: apply
  • 12:08 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/citoid: apply
  • 10:30 aklapper@deploy2002: rebuilt and synchronized wikiversions files: group0 to 1.44.0-wmf.16 refs T382367
  • 09:27 brouberol@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster relforge: test rolling-operation cookbook - brouberol@cumin2002
  • 09:27 brouberol@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster relforge: test rolling-operation cookbook - brouberol@cumin2002
  • 09:24 aklapper@deploy2002: rebuilt and synchronized wikiversions files: group1 to 1.44.0-wmf.16 refs T382367
  • 09:18 brouberol@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster relforge: test rolling-operation cookbook - brouberol@cumin2002
  • 09:18 brouberol@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster relforge: test rolling-operation cookbook - brouberol@cumin2002
  • 09:16 brouberol@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster relforge: test rolling-operation cookbook - brouberol@cumin2002
  • 09:16 brouberol@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster relforge: test rolling-operation cookbook - brouberol@cumin2002
  • 08:49 dcausse: closing the UTC morning backport window
  • 08:42 dcausse@deploy2002: Finished scap sync-world: Backport for cirrus: update ltr model on enwiki (T385972) (duration: 13m 10s)
  • 08:35 dcausse@deploy2002: gmodena, dcausse: Continuing with sync
  • 08:31 dcausse@deploy2002: gmodena, dcausse: Backport for cirrus: update ltr model on enwiki (T385972) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:28 dcausse@deploy2002: Started scap sync-world: Backport for cirrus: update ltr model on enwiki (T385972)
  • 08:25 dcausse@deploy2002: Finished scap sync-world: Backport for cirrus: create buckets for mlr 2025 experiment (T385972), cirrus: deploy new mlr models (T385972) (duration: 17m 03s)
  • 08:18 dcausse@deploy2002: dcausse, gmodena: Continuing with sync
  • 08:11 dcausse@deploy2002: dcausse, gmodena: Backport for cirrus: create buckets for mlr 2025 experiment (T385972), cirrus: deploy new mlr models (T385972) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:08 dcausse@deploy2002: Started scap sync-world: Backport for cirrus: create buckets for mlr 2025 experiment (T385972), cirrus: deploy new mlr models (T385972)
  • 04:55 eileen: civicrm upgraded from 7ceb3ee9 to 454e0ccd
  • 02:02 zabe@deploy2002: Finished scap sync-world: Backport for MCR Stage 4: Reduce dewiktionary revision-slots cache expiry (duration: 11m 46s)
  • 01:55 zabe@deploy2002: zabe: Continuing with sync
  • 01:55 zabe@deploy2002: zabe: Backport for MCR Stage 4: Reduce dewiktionary revision-slots cache expiry synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 01:50 zabe@deploy2002: Started scap sync-world: Backport for MCR Stage 4: Reduce dewiktionary revision-slots cache expiry
  • 00:23 eileen: civicrm upgraded from 00b560e4 to 7ceb3ee9
  • 00:18 eileen: civicrm upgraded from d027bc7b to 00b560e4
  • 00:12 zabe: zabe@mwmaint2002:~$ cat /srv/mediawiki-staging/dblists/group1.dblist | xargs -I{} bash -c "echo {}; mwscript extensions/WikimediaMaintenance/migrateESRefToContentTableStage2.php {} --delete /home/zabe/text_table_cleanup/{} --sleep 0.3" # T183490

2025-02-11

2025-02-10

2025-02-09

  • 13:52 cmooney@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on cr2-magru with reason: IBGP instability from cr1 to cr2 in magru causing ping failures from alert1002
  • 01:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2240 (T384592)', diff saved to https://phabricator.wikimedia.org/P73430 and previous config saved to /var/cache/conftool/dbconfig/20250209-013642-marostegui.json
  • 01:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2240', diff saved to https://phabricator.wikimedia.org/P73429 and previous config saved to /var/cache/conftool/dbconfig/20250209-012135-marostegui.json
  • 01:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2240', diff saved to https://phabricator.wikimedia.org/P73428 and previous config saved to /var/cache/conftool/dbconfig/20250209-010628-marostegui.json
  • 00:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2240 (T384592)', diff saved to https://phabricator.wikimedia.org/P73427 and previous config saved to /var/cache/conftool/dbconfig/20250209-005121-marostegui.json

2025-02-08

  • 19:36 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2240 (T384592)', diff saved to https://phabricator.wikimedia.org/P73426 and previous config saved to /var/cache/conftool/dbconfig/20250208-193620-marostegui.json
  • 19:36 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2240.codfw.wmnet with reason: Maintenance
  • 15:32 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2239.codfw.wmnet with reason: Maintenance
  • 15:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2237 (T384592)', diff saved to https://phabricator.wikimedia.org/P73425 and previous config saved to /var/cache/conftool/dbconfig/20250208-153144-marostegui.json
  • 15:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P73424 and previous config saved to /var/cache/conftool/dbconfig/20250208-151636-marostegui.json
  • 15:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P73423 and previous config saved to /var/cache/conftool/dbconfig/20250208-150130-marostegui.json
  • 14:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2237 (T384592)', diff saved to https://phabricator.wikimedia.org/P73422 and previous config saved to /var/cache/conftool/dbconfig/20250208-144623-marostegui.json
  • 09:17 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2237 (T384592)', diff saved to https://phabricator.wikimedia.org/P73421 and previous config saved to /var/cache/conftool/dbconfig/20250208-091745-marostegui.json
  • 09:17 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2237.codfw.wmnet with reason: Maintenance
  • 09:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2236 (T384592)', diff saved to https://phabricator.wikimedia.org/P73420 and previous config saved to /var/cache/conftool/dbconfig/20250208-091721-marostegui.json
  • 09:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P73419 and previous config saved to /var/cache/conftool/dbconfig/20250208-090214-marostegui.json
  • 08:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P73418 and previous config saved to /var/cache/conftool/dbconfig/20250208-084707-marostegui.json
  • 08:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2236 (T384592)', diff saved to https://phabricator.wikimedia.org/P73417 and previous config saved to /var/cache/conftool/dbconfig/20250208-083201-marostegui.json
  • 03:40 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2236 (T384592)', diff saved to https://phabricator.wikimedia.org/P73416 and previous config saved to /var/cache/conftool/dbconfig/20250208-034038-marostegui.json
  • 03:40 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2236.codfw.wmnet with reason: Maintenance
  • 03:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2219 (T384592)', diff saved to https://phabricator.wikimedia.org/P73415 and previous config saved to /var/cache/conftool/dbconfig/20250208-034015-marostegui.json
  • 03:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P73414 and previous config saved to /var/cache/conftool/dbconfig/20250208-032508-marostegui.json
  • 03:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P73413 and previous config saved to /var/cache/conftool/dbconfig/20250208-031000-marostegui.json
  • 02:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2219 (T384592)', diff saved to https://phabricator.wikimedia.org/P73412 and previous config saved to /var/cache/conftool/dbconfig/20250208-025453-marostegui.json
  • 00:31 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.categories-reload (exit_code=0) reloading categories to wdqs1025.eqiad.wmnet
  • 00:30 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.categories-reload (exit_code=0) reloading categories to wdqs1026.eqiad.wmnet

2025-02-07

  • 23:11 bking@cumin2002: START - Cookbook sre.wdqs.categories-reload reloading categories to wdqs1025.eqiad.wmnet
  • 23:11 bking@cumin2002: START - Cookbook sre.wdqs.categories-reload reloading categories to wdqs1026.eqiad.wmnet
  • 23:06 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.categories-reload (exit_code=0) reloading categories to wdqs2021.codfw.wmnet
  • 23:01 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.categories-reload (exit_code=0) reloading categories to wdqs1022.eqiad.wmnet
  • 23:00 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.categories-reload (exit_code=0) reloading categories to wdqs1021.eqiad.wmnet
  • 22:58 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.categories-reload (exit_code=0) reloading categories to wdqs2020.codfw.wmnet
  • 22:04 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2219 (T384592)', diff saved to https://phabricator.wikimedia.org/P73410 and previous config saved to /var/cache/conftool/dbconfig/20250207-220433-marostegui.json
  • 22:04 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2219.codfw.wmnet with reason: Maintenance
  • 22:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2210 (T384592)', diff saved to https://phabricator.wikimedia.org/P73409 and previous config saved to /var/cache/conftool/dbconfig/20250207-220411-marostegui.json
  • 21:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P73408 and previous config saved to /var/cache/conftool/dbconfig/20250207-214904-marostegui.json
  • 21:40 bking@cumin2002: START - Cookbook sre.wdqs.categories-reload reloading categories to wdqs1022.eqiad.wmnet
  • 21:39 bking@cumin2002: START - Cookbook sre.wdqs.categories-reload reloading categories to wdqs1021.eqiad.wmnet
  • 21:38 bking@cumin2002: START - Cookbook sre.wdqs.categories-reload reloading categories to wdqs2021.codfw.wmnet
  • 21:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P73407 and previous config saved to /var/cache/conftool/dbconfig/20250207-213357-marostegui.json
  • 21:33 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.categories-reload (exit_code=0) reloading categories to wdqs2019.codfw.wmnet
  • 21:28 bking@cumin2002: START - Cookbook sre.wdqs.categories-reload reloading categories to wdqs2020.codfw.wmnet
  • 21:18 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2210 (T384592)', diff saved to https://phabricator.wikimedia.org/P73406 and previous config saved to /var/cache/conftool/dbconfig/20250207-211851-marostegui.json
  • 21:16 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.categories-reload (exit_code=0) reloading categories to wdqs2018.codfw.wmnet
  • 20:38 marostegui@cumin1002: dbctl commit (dc=all): 'db1170 (re)pooling @ 100%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73405 and previous config saved to /var/cache/conftool/dbconfig/20250207-203816-root.json
  • 20:23 marostegui@cumin1002: dbctl commit (dc=all): 'db1170 (re)pooling @ 75%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73404 and previous config saved to /var/cache/conftool/dbconfig/20250207-202311-root.json
  • 20:08 marostegui@cumin1002: dbctl commit (dc=all): 'db1170 (re)pooling @ 50%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73403 and previous config saved to /var/cache/conftool/dbconfig/20250207-200805-root.json
  • 20:06 bking@cumin2002: START - Cookbook sre.wdqs.categories-reload reloading categories to wdqs2019.codfw.wmnet
  • 19:53 marostegui@cumin1002: dbctl commit (dc=all): 'db1170 (re)pooling @ 25%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73402 and previous config saved to /var/cache/conftool/dbconfig/20250207-195300-root.json
  • 19:48 bking@cumin2002: START - Cookbook sre.wdqs.categories-reload reloading categories to wdqs2018.codfw.wmnet
  • 19:37 marostegui@cumin1002: dbctl commit (dc=all): 'db1170 (re)pooling @ 10%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73401 and previous config saved to /var/cache/conftool/dbconfig/20250207-193754-root.json
  • 18:33 vriley@cumin1002: START - Cookbook sre.hosts.provision for host db1256.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 18:32 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ganeti1054.eqiad.wmnet with OS bookworm
  • 18:14 vriley@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1256
  • 18:12 vriley@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host db1256
  • 18:07 vriley@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:07 vriley@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt db1256 - vriley@cumin1002"
  • 18:07 vriley@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt db1256 - vriley@cumin1002"
  • 18:04 rzl@deploy2002: Finished scap sync-world: https://gerrit.wikimedia.org/r/1118003 (duration: 12m 54s)
  • 18:03 vriley@cumin1002: START - Cookbook sre.dns.netbox
  • 18:01 vriley@cumin1002: START - Cookbook sre.hosts.provision for host db1255.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 17:59 vriley@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1255
  • 17:59 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ganeti1053.eqiad.wmnet with OS bookworm
  • 17:58 rzl@deploy2002: rzl: Continuing with sync
  • 17:58 vriley@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host db1255
  • 17:58 vriley@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:58 vriley@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt db1255 - vriley@cumin1002"
  • 17:58 vriley@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt db1255 - vriley@cumin1002"
  • 17:57 rzl@deploy2002: rzl: https://gerrit.wikimedia.org/r/1118003 synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 17:55 rzl@deploy2002: Started scap sync-world: https://gerrit.wikimedia.org/r/1118003
  • 17:52 vriley@cumin1002: START - Cookbook sre.dns.netbox
  • 17:38 vriley@cumin1002: START - Cookbook sre.hosts.reimage for host ganeti1054.eqiad.wmnet with OS bookworm
  • 17:37 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ganeti1054.eqiad.wmnet with OS bookworm
  • 17:36 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:36 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add mgmt dns names for test nokia switches - cmooney@cumin1002"
  • 17:35 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add mgmt dns names for test nokia switches - cmooney@cumin1002"
  • 16:46 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 16:34 cdanis@cumin1002: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "[not really into teleological thinking] - cdanis@cumin1002"
  • 16:34 cdanis@cumin1002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: [not really into teleological thinking] - cdanis@cumin1002
  • 16:34 cdanis@cumin1002: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: [not really into teleological thinking] - cdanis@cumin1002
  • 16:34 cdanis@cumin1002: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "[not really into teleological thinking] - cdanis@cumin1002"
  • 16:33 cdanis@cumin1002: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "[not really into teleological thinking] - cdanis@cumin1002"
  • 16:33 cdanis@cumin1002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: [not really into teleological thinking] - cdanis@cumin1002
  • 16:33 cdanis@cumin1002: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: [not really into teleological thinking] - cdanis@cumin1002
  • 16:33 cdanis@cumin1002: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "[not really into teleological thinking] - cdanis@cumin1002"
  • 16:16 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2210 (T384592)', diff saved to https://phabricator.wikimedia.org/P73400 and previous config saved to /var/cache/conftool/dbconfig/20250207-161646-marostegui.json
  • 16:16 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2210.codfw.wmnet with reason: Maintenance
  • 16:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2206 (T384592)', diff saved to https://phabricator.wikimedia.org/P73399 and previous config saved to /var/cache/conftool/dbconfig/20250207-161624-marostegui.json
  • 16:16 vriley@cumin1002: START - Cookbook sre.hosts.reimage for host ganeti1054.eqiad.wmnet with OS bookworm
  • 16:15 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ganeti1054.eqiad.wmnet with OS bookworm
  • 16:13 vriley@cumin1002: START - Cookbook sre.hosts.reimage for host ganeti1053.eqiad.wmnet with OS bookworm
  • 16:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P73398 and previous config saved to /var/cache/conftool/dbconfig/20250207-160117-marostegui.json
  • 15:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P73397 and previous config saved to /var/cache/conftool/dbconfig/20250207-154610-marostegui.json
  • 15:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2206 (T384592)', diff saved to https://phabricator.wikimedia.org/P73396 and previous config saved to /var/cache/conftool/dbconfig/20250207-153103-marostegui.json
  • 15:04 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb1020.eqiad.wmnet with reason: maintenance
  • 15:04 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb1016.eqiad.wmnet with reason: maintenance
  • 15:03 root@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1170.eqiad.wmnet with reason: Index rebuild
  • 15:02 root@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db1170.eqiad.wmnet
  • 14:56 root@cumin1002: START - Cookbook sre.mysql.upgrade for db1170.eqiad.wmnet
  • 14:55 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1170', diff saved to https://phabricator.wikimedia.org/P73395 and previous config saved to /var/cache/conftool/dbconfig/20250207-145547-marostegui.json
  • 14:50 fnegri@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1020.eqiad.wmnet,service=s5
  • 14:50 fnegri@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1020.eqiad.wmnet,service=s8
  • 14:36 fnegri@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1020.eqiad.wmnet,service=s8
  • 14:36 fnegri@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1020.eqiad.wmnet,service=s5
  • 14:36 fnegri@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1020.eqiad.wmnet with reason: Rebooting clouddb1020 T384946
  • 14:35 fnegri@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1019.eqiad.wmnet,service=s6
  • 14:35 fnegri@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1019.eqiad.wmnet,service=s4
  • 14:22 fnegri@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1019.eqiad.wmnet with reason: Rebooting clouddb1019 T384946
  • 14:21 fnegri@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1019.eqiad.wmnet,service=s6
  • 14:21 fnegri@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1019.eqiad.wmnet,service=s4
  • 14:20 fnegri@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1015.eqiad.wmnet
  • 14:20 fnegri@cumin1002: START - Cookbook sre.hosts.remove-downtime for clouddb1015.eqiad.wmnet
  • 14:20 fnegri@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1015.eqiad.wmnet,service=s4
  • 14:20 fnegri@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1015.eqiad.wmnet,service=s6
  • 14:11 fnegri@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1015.eqiad.wmnet with reason: Rebooting clouddb1015 T384946
  • 14:10 fnegri@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host clouddb1015.eqiad.wmnet
  • 14:03 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb1018.eqiad.wmnet with reason: maintenance
  • 14:03 fnegri@cumin1002: START - Cookbook sre.hosts.reboot-single for host clouddb1015.eqiad.wmnet
  • 14:02 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb1014.eqiad.wmnet with reason: maintenance
  • 14:02 fnegri@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1015.eqiad.wmnet,service=s6
  • 14:02 fnegri@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1015.eqiad.wmnet,service=s4
  • 12:26 marostegui@cumin1002: dbctl commit (dc=all): 'db2150 (re)pooling @ 100%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73394 and previous config saved to /var/cache/conftool/dbconfig/20250207-122645-root.json
  • 12:11 marostegui@cumin1002: dbctl commit (dc=all): 'db2150 (re)pooling @ 75%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73393 and previous config saved to /var/cache/conftool/dbconfig/20250207-121140-root.json
  • 12:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1033.eqiad.wmnet
  • 12:03 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1033.eqiad.wmnet
  • 11:56 marostegui@cumin1002: dbctl commit (dc=all): 'db2150 (re)pooling @ 50%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73392 and previous config saved to /var/cache/conftool/dbconfig/20250207-115634-root.json
  • 11:50 jmm@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts ganeti1033.eqiad.wmnet
  • 11:42 fnegri@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1018.eqiad.wmnet,service=s2
  • 11:42 fnegri@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1018.eqiad.wmnet,service=s7
  • 11:41 marostegui@cumin1002: dbctl commit (dc=all): 'db2150 (re)pooling @ 25%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73391 and previous config saved to /var/cache/conftool/dbconfig/20250207-114129-root.json
  • 11:40 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ganeti1033.eqiad.wmnet
  • 11:40 jmm@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts ganeti1033.eqiad.wmnet
  • 11:35 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ganeti1033.eqiad.wmnet
  • 11:35 jmm@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ganeti1033.eqiad.wmnet
  • 11:35 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ganeti1033.eqiad.wmnet
  • 11:35 jmm@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts ganeti1033.eqiad.wmnet
  • 11:29 fnegri@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1018.eqiad.wmnet,service=s2
  • 11:29 fnegri@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1018.eqiad.wmnet,service=s7
  • 11:28 fnegri@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1018.eqiad.wmnet with reason: Rebooting clouddb1018 T384946
  • 11:26 marostegui@cumin1002: dbctl commit (dc=all): 'db2150 (re)pooling @ 10%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73390 and previous config saved to /var/cache/conftool/dbconfig/20250207-112624-root.json
  • 11:16 marostegui@cumin1002: dbctl commit (dc=all): 'db2145 (re)pooling @ 100%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73389 and previous config saved to /var/cache/conftool/dbconfig/20250207-111619-root.json
  • 11:14 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ganeti1033.eqiad.wmnet
  • 11:01 marostegui@cumin1002: dbctl commit (dc=all): 'db2145 (re)pooling @ 75%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73388 and previous config saved to /var/cache/conftool/dbconfig/20250207-110114-root.json
  • 10:48 marostegui@cumin1002: dbctl commit (dc=all): 'db1174 (re)pooling @ 100%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73387 and previous config saved to /var/cache/conftool/dbconfig/20250207-104818-root.json
  • 10:46 marostegui@cumin1002: dbctl commit (dc=all): 'db2145 (re)pooling @ 50%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73386 and previous config saved to /var/cache/conftool/dbconfig/20250207-104609-root.json
  • 10:37 marostegui@cumin1002: dbctl commit (dc=all): 'db1234 (re)pooling @ 100%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73385 and previous config saved to /var/cache/conftool/dbconfig/20250207-103710-root.json
  • 10:33 marostegui@cumin1002: dbctl commit (dc=all): 'db1174 (re)pooling @ 75%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73384 and previous config saved to /var/cache/conftool/dbconfig/20250207-103312-root.json
  • 10:31 marostegui@cumin1002: dbctl commit (dc=all): 'db2145 (re)pooling @ 25%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73383 and previous config saved to /var/cache/conftool/dbconfig/20250207-103104-root.json
  • 10:30 fnegri@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1014.eqiad.wmnet
  • 10:30 fnegri@cumin1002: START - Cookbook sre.hosts.remove-downtime for clouddb1014.eqiad.wmnet
  • 10:24 fnegri@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1014.eqiad.wmnet,service=s7
  • 10:24 fnegri@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1014.eqiad.wmnet,service=s2
  • 10:22 marostegui@cumin1002: dbctl commit (dc=all): 'db1234 (re)pooling @ 75%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73382 and previous config saved to /var/cache/conftool/dbconfig/20250207-102205-root.json
  • 10:18 marostegui@cumin1002: dbctl commit (dc=all): 'db1174 (re)pooling @ 50%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73381 and previous config saved to /var/cache/conftool/dbconfig/20250207-101807-root.json
  • 10:16 marostegui@cumin1002: dbctl commit (dc=all): 'db2145 (re)pooling @ 10%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73380 and previous config saved to /var/cache/conftool/dbconfig/20250207-101559-root.json
  • 10:08 fnegri@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1014.eqiad.wmnet with reason: Rebooting clouddb1014 T384946
  • 10:07 fnegri@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1014.eqiad.wmnet,service=s2
  • 10:07 marostegui@cumin1002: dbctl commit (dc=all): 'db1234 (re)pooling @ 50%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73379 and previous config saved to /var/cache/conftool/dbconfig/20250207-100700-root.json
  • 10:07 fnegri@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1014.eqiad.wmnet,service=s7
  • 10:03 marostegui@cumin1002: dbctl commit (dc=all): 'db1174 (re)pooling @ 25%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73378 and previous config saved to /var/cache/conftool/dbconfig/20250207-100302-root.json
  • 09:51 marostegui@cumin1002: dbctl commit (dc=all): 'db1234 (re)pooling @ 25%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73377 and previous config saved to /var/cache/conftool/dbconfig/20250207-095154-root.json
  • 09:47 marostegui@cumin1002: dbctl commit (dc=all): 'db1174 (re)pooling @ 10%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73376 and previous config saved to /var/cache/conftool/dbconfig/20250207-094756-root.json
  • 09:36 marostegui@cumin1002: dbctl commit (dc=all): 'db1234 (re)pooling @ 10%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73375 and previous config saved to /var/cache/conftool/dbconfig/20250207-093649-root.json
  • 09:15 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2206 (T384592)', diff saved to https://phabricator.wikimedia.org/P73374 and previous config saved to /var/cache/conftool/dbconfig/20250207-091459-marostegui.json
  • 09:14 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2206.codfw.wmnet with reason: Maintenance
  • 08:06 marostegui@cumin1002: dbctl commit (dc=all): 'es1030 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P73373 and previous config saved to /var/cache/conftool/dbconfig/20250207-080638-root.json
  • 08:02 marostegui@cumin1002: dbctl commit (dc=all): 'es1027 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P73372 and previous config saved to /var/cache/conftool/dbconfig/20250207-080218-root.json
  • 07:51 marostegui@cumin1002: dbctl commit (dc=all): 'es1030 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P73371 and previous config saved to /var/cache/conftool/dbconfig/20250207-075132-root.json
  • 07:47 marostegui@cumin1002: dbctl commit (dc=all): 'es1027 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P73370 and previous config saved to /var/cache/conftool/dbconfig/20250207-074712-root.json
  • 07:36 marostegui@cumin1002: dbctl commit (dc=all): 'es1030 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P73369 and previous config saved to /var/cache/conftool/dbconfig/20250207-073627-root.json
  • 07:32 marostegui@cumin1002: dbctl commit (dc=all): 'es1027 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P73368 and previous config saved to /var/cache/conftool/dbconfig/20250207-073207-root.json
  • 07:21 marostegui@cumin1002: dbctl commit (dc=all): 'es1030 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P73367 and previous config saved to /var/cache/conftool/dbconfig/20250207-072122-root.json
  • 07:17 marostegui@cumin1002: dbctl commit (dc=all): 'es1027 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P73366 and previous config saved to /var/cache/conftool/dbconfig/20250207-071702-root.json
  • 07:13 kevinbazira@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
  • 07:12 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-models' for release 'main' .
  • 07:08 kevinbazira@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
  • 07:06 marostegui@cumin1002: dbctl commit (dc=all): 'es1030 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P73365 and previous config saved to /var/cache/conftool/dbconfig/20250207-070617-root.json
  • 07:06 root@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for es1030.eqiad.wmnet
  • 07:01 marostegui@cumin1002: dbctl commit (dc=all): 'es1027 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P73364 and previous config saved to /var/cache/conftool/dbconfig/20250207-070156-root.json
  • 07:01 root@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for es1027.eqiad.wmnet
  • 06:57 root@cumin1002: START - Cookbook sre.mysql.upgrade for es1030.eqiad.wmnet
  • 06:57 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es1030', diff saved to https://phabricator.wikimedia.org/P73363 and previous config saved to /var/cache/conftool/dbconfig/20250207-065730-marostegui.json
  • 06:57 marostegui@cumin1002: dbctl commit (dc=all): 'Promote es1026 to es2 master', diff saved to https://phabricator.wikimedia.org/P73362 and previous config saved to /var/cache/conftool/dbconfig/20250207-065700-root.json
  • 06:56 root@cumin1002: START - Cookbook sre.mysql.upgrade for es1027.eqiad.wmnet
  • 06:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es1027', diff saved to https://phabricator.wikimedia.org/P73361 and previous config saved to /var/cache/conftool/dbconfig/20250207-065600-marostegui.json
  • 06:55 marostegui@cumin1002: dbctl commit (dc=all): 'Promote es1029 to es1 master', diff saved to https://phabricator.wikimedia.org/P73360 and previous config saved to /var/cache/conftool/dbconfig/20250207-065546-root.json
  • 06:36 root@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1234.eqiad.wmnet with reason: Index rebuild
  • 06:36 root@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2145.codfw.wmnet with reason: Index rebuild
  • 06:36 root@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2145.codfw.wmnet
  • 06:35 root@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db1234.eqiad.wmnet
  • 06:35 root@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1174.eqiad.wmnet with reason: Index rebuild
  • 06:35 root@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2150.codfw.wmnet with reason: Index rebuild
  • 06:34 root@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2150.codfw.wmnet
  • 06:34 root@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db1174.eqiad.wmnet
  • 06:29 root@cumin1002: START - Cookbook sre.mysql.upgrade for db1234.eqiad.wmnet
  • 06:29 root@cumin1002: START - Cookbook sre.mysql.upgrade for db2145.codfw.wmnet
  • 06:28 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1234 db2145', diff saved to https://phabricator.wikimedia.org/P73359 and previous config saved to /var/cache/conftool/dbconfig/20250207-062857-marostegui.json
  • 06:28 root@cumin1002: START - Cookbook sre.mysql.upgrade for db1174.eqiad.wmnet
  • 06:28 root@cumin1002: START - Cookbook sre.mysql.upgrade for db2150.codfw.wmnet
  • 06:27 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1174 db2150', diff saved to https://phabricator.wikimedia.org/P73358 and previous config saved to /var/cache/conftool/dbconfig/20250207-062745-marostegui.json
  • 03:42 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2199.codfw.wmnet with reason: Maintenance
  • 03:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T384592)', diff saved to https://phabricator.wikimedia.org/P73357 and previous config saved to /var/cache/conftool/dbconfig/20250207-034149-marostegui.json
  • 03:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P73356 and previous config saved to /var/cache/conftool/dbconfig/20250207-032642-marostegui.json
  • 03:14 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ganeti1053.eqiad.wmnet with OS bookworm
  • 03:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P73355 and previous config saved to /var/cache/conftool/dbconfig/20250207-031134-marostegui.json
  • 02:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T384592)', diff saved to https://phabricator.wikimedia.org/P73354 and previous config saved to /var/cache/conftool/dbconfig/20250207-025628-marostegui.json
  • 02:00 vriley@cumin1002: START - Cookbook sre.hosts.reimage for host ganeti1054.eqiad.wmnet with OS bookworm
  • 01:57 andrew@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudvirt1041.eqiad.wmnet
  • 01:54 vriley@cumin1002: START - Cookbook sre.hosts.reimage for host ganeti1053.eqiad.wmnet with OS bookworm
  • 01:49 vriley@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti1053.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 01:49 andrew@cumin1002: START - Cookbook sre.hosts.reboot-single for host cloudvirt1041.eqiad.wmnet
  • 01:48 vriley@cumin1002: START - Cookbook sre.hosts.provision for host ganeti1053.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 01:42 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1053.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 01:41 vriley@cumin1002: START - Cookbook sre.hosts.provision for host ganeti1053.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART

2025-02-06

  • 23:48 cstone: payments-wiki upgraded from d266fdf9 to 793998c0
  • 23:07 swfrench-wmf: ran cumin 'A:cp-text' 'run-puppet-agent -e "merging ATS Lua config change - T383845"' at 21:58:47 (retroactive)
  • 21:48 swfrench-wmf: ran cumin 'A:cp-text' 'disable-puppet "merging ATS Lua config change - T383845"'
  • 21:27 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2172 (T384592)', diff saved to https://phabricator.wikimedia.org/P73352 and previous config saved to /var/cache/conftool/dbconfig/20250206-212719-marostegui.json
  • 21:27 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2172.codfw.wmnet with reason: Maintenance
  • 21:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T384592)', diff saved to https://phabricator.wikimedia.org/P73351 and previous config saved to /var/cache/conftool/dbconfig/20250206-212656-marostegui.json
  • 21:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P73350 and previous config saved to /var/cache/conftool/dbconfig/20250206-211149-marostegui.json
  • 20:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P73349 and previous config saved to /var/cache/conftool/dbconfig/20250206-205642-marostegui.json
  • 20:54 marostegui@cumin1002: dbctl commit (dc=all): 'db2213 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P73348 and previous config saved to /var/cache/conftool/dbconfig/20250206-205437-root.json
  • 20:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T384592)', diff saved to https://phabricator.wikimedia.org/P73347 and previous config saved to /var/cache/conftool/dbconfig/20250206-204135-marostegui.json
  • 20:39 marostegui@cumin1002: dbctl commit (dc=all): 'db2213 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P73346 and previous config saved to /var/cache/conftool/dbconfig/20250206-203932-root.json
  • 20:27 pt1979@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti1054.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 20:24 marostegui@cumin1002: dbctl commit (dc=all): 'db2213 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P73345 and previous config saved to /var/cache/conftool/dbconfig/20250206-202426-root.json
  • 20:21 pt1979@cumin1002: START - Cookbook sre.hosts.provision for host ganeti1054.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 20:09 marostegui@cumin1002: dbctl commit (dc=all): 'db2213 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P73344 and previous config saved to /var/cache/conftool/dbconfig/20250206-200921-root.json
  • 19:54 marostegui@cumin1002: dbctl commit (dc=all): 'db2213 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P73343 and previous config saved to /var/cache/conftool/dbconfig/20250206-195417-root.json
  • 19:52 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2213', diff saved to https://phabricator.wikimedia.org/P73342 and previous config saved to /var/cache/conftool/dbconfig/20250206-195250-marostegui.json
  • 19:52 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2213.codfw.wmnet with reason: maintenance
  • 19:32 sukhe: sudo cumin 'A:cumin' 'run-puppet-agent'
  • 19:05 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1033.eqiad.wmnet
  • 18:44 marostegui@cumin1002: dbctl commit (dc=all): 'db2159 (re)pooling @ 100%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73339 and previous config saved to /var/cache/conftool/dbconfig/20250206-184451-root.json
  • 18:42 cdanis@deploy2002: Finished scap sync-world: Backport for Route PHP8 Excimer profiles to separate ArcLamp sinks (T383845 T385395 T385199) (duration: 10m 58s)
  • 18:34 cdanis@deploy2002: cdanis: Continuing with sync
  • 18:33 cdanis@deploy2002: cdanis: Backport for Route PHP8 Excimer profiles to separate ArcLamp sinks (T383845 T385395 T385199) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 18:31 cdanis@deploy2002: Started scap sync-world: Backport for Route PHP8 Excimer profiles to separate ArcLamp sinks (T383845 T385395 T385199)
  • 18:29 marostegui@cumin1002: dbctl commit (dc=all): 'db2159 (re)pooling @ 75%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73338 and previous config saved to /var/cache/conftool/dbconfig/20250206-182946-root.json
  • 18:28 andrew@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet2008-dev.codfw.wmnet
  • 18:22 andrew@cumin1002: START - Cookbook sre.hosts.reboot-single for host cloudnet2008-dev.codfw.wmnet
  • 18:21 andrew@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet2007-dev.codfw.wmnet
  • 18:15 andrew@cumin1002: START - Cookbook sre.hosts.reboot-single for host cloudnet2007-dev.codfw.wmnet
  • 18:15 andrew@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet2006-dev.codfw.wmnet
  • 18:14 marostegui@cumin1002: dbctl commit (dc=all): 'db2159 (re)pooling @ 50%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73337 and previous config saved to /var/cache/conftool/dbconfig/20250206-181441-root.json
  • 18:08 andrew@cumin1002: START - Cookbook sre.hosts.reboot-single for host cloudnet2006-dev.codfw.wmnet
  • 18:08 andrew@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet2005-dev.codfw.wmnet
  • 18:01 andrew@cumin1002: START - Cookbook sre.hosts.reboot-single for host cloudnet2005-dev.codfw.wmnet
  • 17:59 marostegui@cumin1002: dbctl commit (dc=all): 'db2159 (re)pooling @ 25%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73336 and previous config saved to /var/cache/conftool/dbconfig/20250206-175936-root.json
  • 17:55 andrew@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudservices2005-dev.codfw.wmnet
  • 17:48 andrew@cumin1002: START - Cookbook sre.hosts.reboot-single for host cloudservices2005-dev.codfw.wmnet
  • 17:48 andrew@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudservices2004-dev.codfw.wmnet
  • 17:44 marostegui@cumin1002: dbctl commit (dc=all): 'db2159 (re)pooling @ 10%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73335 and previous config saved to /var/cache/conftool/dbconfig/20250206-174431-root.json
  • 17:41 andrew@cumin1002: START - Cookbook sre.hosts.reboot-single for host cloudservices2004-dev.codfw.wmnet
  • 17:18 marostegui@cumin1002: dbctl commit (dc=all): 'db1235 (re)pooling @ 100%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73334 and previous config saved to /var/cache/conftool/dbconfig/20250206-171835-root.json
  • 17:16 marostegui@cumin1002: dbctl commit (dc=all): 'db2188 (re)pooling @ 100%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73333 and previous config saved to /var/cache/conftool/dbconfig/20250206-171626-root.json
  • 17:16 marostegui@cumin1002: dbctl commit (dc=all): 'db1191 (re)pooling @ 100%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73332 and previous config saved to /var/cache/conftool/dbconfig/20250206-171601-root.json
  • 17:15 swfrench-wmf: mw-api-int mw-jobrunner mw-parsoid reverted to 100% PHP 7.4 as of 17:03 - T383845
  • 17:03 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 17:03 marostegui@cumin1002: dbctl commit (dc=all): 'db1235 (re)pooling @ 75%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73331 and previous config saved to /var/cache/conftool/dbconfig/20250206-170330-root.json
  • 17:03 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 17:03 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 17:03 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 17:02 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
  • 17:02 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
  • 17:02 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-jobrunner: apply
  • 17:01 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-jobrunner: apply
  • 17:01 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
  • 17:01 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:01 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
  • 17:01 marostegui@cumin1002: dbctl commit (dc=all): 'db2188 (re)pooling @ 75%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73330 and previous config saved to /var/cache/conftool/dbconfig/20250206-170121-root.json
  • 17:01 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
  • 17:00 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
  • 17:00 marostegui@cumin1002: dbctl commit (dc=all): 'db1191 (re)pooling @ 75%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73329 and previous config saved to /var/cache/conftool/dbconfig/20250206-170055-root.json
  • 16:59 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 16:58 swfrench@deploy2002: Finished scap sync-world: Backport for Disable cookie-based enrollment in 8.1 (T383845) (duration: 10m 03s)
  • 16:52 swfrench@deploy2002: swfrench: Continuing with sync
  • 16:51 swfrench@deploy2002: swfrench: Backport for Disable cookie-based enrollment in 8.1 (T383845) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 16:48 swfrench@deploy2002: Started scap sync-world: Backport for Disable cookie-based enrollment in 8.1 (T383845)
  • 16:48 marostegui@cumin1002: dbctl commit (dc=all): 'db1235 (re)pooling @ 50%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73328 and previous config saved to /var/cache/conftool/dbconfig/20250206-164825-root.json
  • 16:46 marostegui@cumin1002: dbctl commit (dc=all): 'db2188 (re)pooling @ 50%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73327 and previous config saved to /var/cache/conftool/dbconfig/20250206-164615-root.json
  • 16:45 marostegui@cumin1002: dbctl commit (dc=all): 'db1191 (re)pooling @ 50%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73326 and previous config saved to /var/cache/conftool/dbconfig/20250206-164550-root.json
  • 16:42 aikochou@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revision-models' for release 'main' .
  • 16:34 aikochou@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revision-models' for release 'main' .
  • 16:33 marostegui@cumin1002: dbctl commit (dc=all): 'db1235 (re)pooling @ 25%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73325 and previous config saved to /var/cache/conftool/dbconfig/20250206-163320-root.json
  • 16:31 marostegui@cumin1002: dbctl commit (dc=all): 'db2188 (re)pooling @ 25%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73324 and previous config saved to /var/cache/conftool/dbconfig/20250206-163109-root.json
  • 16:30 marostegui@cumin1002: dbctl commit (dc=all): 'db1191 (re)pooling @ 25%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73323 and previous config saved to /var/cache/conftool/dbconfig/20250206-163044-root.json
  • 16:18 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
  • 16:18 marostegui@cumin1002: dbctl commit (dc=all): 'db1235 (re)pooling @ 10%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73322 and previous config saved to /var/cache/conftool/dbconfig/20250206-161814-root.json
  • 16:18 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
  • 16:17 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
  • 16:17 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
  • 16:16 marostegui@cumin1002: dbctl commit (dc=all): 'db2188 (re)pooling @ 10%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73321 and previous config saved to /var/cache/conftool/dbconfig/20250206-161604-root.json
  • 16:15 marostegui@cumin1002: dbctl commit (dc=all): 'db1191 (re)pooling @ 10%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73320 and previous config saved to /var/cache/conftool/dbconfig/20250206-161539-root.json
  • 16:12 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
  • 16:12 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
  • 16:12 pt1979@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1054.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:11 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ms-be[2051-2056].codfw.wmnet
  • 16:11 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:11 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ms-be[2051-2056].codfw.wmnet decommissioned, removing all IPs except the asset tag one - mvernon@cumin2002"
  • 16:09 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ms-be[2051-2056].codfw.wmnet decommissioned, removing all IPs except the asset tag one - mvernon@cumin2002"
  • 16:05 pt1979@cumin1002: START - Cookbook sre.hosts.provision for host ganeti1054.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:01 mvernon@cumin2002: START - Cookbook sre.dns.netbox
  • 15:40 mvernon@cumin2002: START - Cookbook sre.hosts.decommission for hosts ms-be[2051-2056].codfw.wmnet
  • 15:38 vriley@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host fransc1001
  • 15:38 vriley@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host fransc1001
  • 15:36 vriley@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:36 vriley@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt fransc1001 - vriley@cumin1002"
  • 15:36 vriley@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt fransc1001 - vriley@cumin1002"
  • 15:34 godog: systemctl restart thanos-query on titan1*
  • 15:32 vriley@cumin1002: START - Cookbook sre.dns.netbox
  • 15:07 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2155 (T384592)', diff saved to https://phabricator.wikimedia.org/P73319 and previous config saved to /var/cache/conftool/dbconfig/20250206-150702-marostegui.json
  • 15:06 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 15:06 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2155.codfw.wmnet with reason: Maintenance
  • 15:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T384592)', diff saved to https://phabricator.wikimedia.org/P73318 and previous config saved to /var/cache/conftool/dbconfig/20250206-150624-marostegui.json
  • 15:01 Lucas_WMDE: elukey@deploy2002 helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' . [re-log due to stashbot issue, originally logged 14:58 UTC]
  • 14:52 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1033.eqiad.wmnet
  • 14:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P73317 and previous config saved to /var/cache/conftool/dbconfig/20250206-145117-marostegui.json
  • 14:51 urbanecm@deploy2002: Finished scap sync-world: Backport for temp accounts: Enable IP reveal rights for local groups on meta (T356294) (duration: 13m 28s)
  • 14:46 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster1003.eqiad.wmnet to plain
  • 14:46 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster1003.eqiad.wmnet to plain
  • 14:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1033.eqiad.wmnet
  • 14:45 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1033.eqiad.wmnet
  • 14:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster1003.eqiad.wmnet to drbd
  • 14:44 urbanecm@deploy2002: tchanders, urbanecm: Continuing with sync
  • 14:40 urbanecm@deploy2002: tchanders, urbanecm: Backport for temp accounts: Enable IP reveal rights for local groups on meta (T356294) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:37 urbanecm@deploy2002: Started scap sync-world: Backport for temp accounts: Enable IP reveal rights for local groups on meta (T356294)
  • 14:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P73316 and previous config saved to /var/cache/conftool/dbconfig/20250206-143609-marostegui.json
  • 14:30 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster1003.eqiad.wmnet to drbd
  • 14:28 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1033.eqiad.wmnet
  • 14:28 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1033.eqiad.wmnet
  • 14:27 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1033.eqiad.wmnet
  • 14:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1033.eqiad.wmnet
  • 14:22 urbanecm@deploy2002: Finished scap sync-world: Backport for Disable new WebAuthn credentials creation (T378402 T354701) (duration: 14m 00s)
  • 14:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T384592)', diff saved to https://phabricator.wikimedia.org/P73314 and previous config saved to /var/cache/conftool/dbconfig/20250206-142102-marostegui.json
  • 14:16 urbanecm@deploy2002: pmiazga, urbanecm: Continuing with sync
  • 14:11 urbanecm@deploy2002: pmiazga, urbanecm: Backport for Disable new WebAuthn credentials creation (T378402 T354701) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:08 urbanecm@deploy2002: Started scap sync-world: Backport for Disable new WebAuthn credentials creation (T378402 T354701)
  • 14:04 urbanecm@deploy2002: Finished scap sync-world: Backport for Babel: Remove config that is now in community configuration (T385239), Babel: Do not use a wmg variable for BabelDefaultLevel (T119117), Babel: Merge back into InitialiseSettings.php (T385239) (duration: 10m 36s)
  • 13:58 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1038.eqiad.wmnet to cluster eqiad and group D
  • 13:57 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1038.eqiad.wmnet to cluster eqiad and group D
  • 13:57 urbanecm@deploy2002: urbanecm: Continuing with sync
  • 13:56 urbanecm@deploy2002: urbanecm: Backport for Babel: Remove config that is now in community configuration (T385239), Babel: Do not use a wmg variable for BabelDefaultLevel (T119117), Babel: Merge back into InitialiseSettings.php (T385239) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1038.eqiad.wmnet
  • 13:53 urbanecm@deploy2002: Started scap sync-world: Backport for Babel: Remove config that is now in community configuration (T385239), Babel: Do not use a wmg variable for BabelDefaultLevel (T119117), Babel: Merge back into InitialiseSettings.php (T385239)
  • 13:44 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1038.eqiad.wmnet
  • 13:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti1038.eqiad.wmnet with OS bookworm
  • 13:34 cgoubert@deploy2002: Finished scap sync-world: no-op deploy to clean up diff (duration: 02m 59s)
  • 13:32 cgoubert@deploy2002: Started scap sync-world: no-op deploy to clean up diff
  • 13:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti1038.eqiad.wmnet with reason: host reimage
  • 13:21 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-cron: sync
  • 13:21 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-cron: sync
  • 13:19 root@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1235.eqiad.wmnet with reason: Index rebuild
  • 13:18 root@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db1235.eqiad.wmnet
  • 13:18 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti1038.eqiad.wmnet with reason: host reimage
  • 13:18 root@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2188.codfw.wmnet with reason: Index rebuild
  • 13:18 root@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2188.codfw.wmnet
  • 13:13 root@cumin1002: START - Cookbook sre.mysql.upgrade for db2188.codfw.wmnet
  • 13:13 root@cumin1002: START - Cookbook sre.mysql.upgrade for db1235.eqiad.wmnet
  • 13:13 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1235 db2188 T385561', diff saved to https://phabricator.wikimedia.org/P73313 and previous config saved to /var/cache/conftool/dbconfig/20250206-131300-marostegui.json
  • 13:04 root@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2159.codfw.wmnet with reason: Index rebuild
  • 13:04 root@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1191.eqiad.wmnet with reason: Index rebuild
  • 13:04 root@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2159.codfw.wmnet
  • 13:04 root@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db1191.eqiad.wmnet
  • 12:59 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1038.eqiad.wmnet with OS bookworm
  • 12:58 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 12:57 kamila@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 12:57 root@cumin1002: START - Cookbook sre.mysql.upgrade for db1191.eqiad.wmnet
  • 12:57 root@cumin1002: START - Cookbook sre.mysql.upgrade for db2159.codfw.wmnet
  • 12:57 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2159 db1191 T385550', diff saved to https://phabricator.wikimedia.org/P73312 and previous config saved to /var/cache/conftool/dbconfig/20250206-125713-marostegui.json
  • 12:56 fnegri@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1016.eqiad.wmnet,service=s5
  • 12:55 fnegri@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1016.eqiad.wmnet,service=s8
  • 12:45 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 12:45 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: apply
  • 12:45 kamila@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 12:44 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: apply
  • 12:43 jmm@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on ganeti1038.eqiad.wmnet with reason: remove from cluster for reimage
  • 12:41 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1038.eqiad.wmnet
  • 12:40 kamila@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 12:40 kamila@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 12:40 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: apply
  • 12:39 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: apply
  • 12:30 ladsgroup@deploy2002: Finished scap sync-world: Backport for Set categorylinks to write both everywhere except commonswiki (T385164) (duration: 11m 50s)
  • 12:27 moritzm: installing openjpeg2 security updates
  • 12:23 ladsgroup@deploy2002: ladsgroup: Continuing with sync
  • 12:22 klausman@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
  • 12:21 klausman@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
  • 12:21 ladsgroup@deploy2002: ladsgroup: Backport for Set categorylinks to write both everywhere except commonswiki (T385164) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 12:18 ladsgroup@deploy2002: Started scap sync-world: Backport for Set categorylinks to write both everywhere except commonswiki (T385164)
  • 12:11 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/changeprop: apply
  • 12:11 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/changeprop: apply
  • 12:06 moritzm: installing bind9 security updates (client-side libs/tools only)
  • 11:58 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/changeprop: apply
  • 11:55 marostegui@cumin1002: dbctl commit (dc=all): 'db2208 (re)pooling @ 100%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73311 and previous config saved to /var/cache/conftool/dbconfig/20250206-115556-root.json
  • 11:53 klausman@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
  • 11:53 klausman@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
  • 11:53 klausman@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
  • 11:52 klausman@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
  • 11:51 fnegri@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host clouddb1016.eqiad.wmnet
  • 11:51 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/changeprop: apply
  • 11:51 jgiannelos@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
  • 11:50 jgiannelos@deploy2002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
  • 11:50 jgiannelos@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
  • 11:49 jgiannelos@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
  • 11:49 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
  • 11:49 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
  • 11:48 fnegri@cumin1002: START - Cookbook sre.hosts.reboot-single for host clouddb1016.eqiad.wmnet
  • 11:40 marostegui@cumin1002: dbctl commit (dc=all): 'db2208 (re)pooling @ 75%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73310 and previous config saved to /var/cache/conftool/dbconfig/20250206-114051-root.json
  • 11:40 moritzm: installing iperf3 security updates
  • 11:34 fnegri@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1016.eqiad.wmnet with reason: Rebooting clouddb1016 T384946
  • 11:32 fnegri@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1016.eqiad.wmnet,service=s8
  • 11:32 fnegri@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1016.eqiad.wmnet,service=s5
  • 11:25 marostegui@cumin1002: dbctl commit (dc=all): 'db2208 (re)pooling @ 50%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73309 and previous config saved to /var/cache/conftool/dbconfig/20250206-112546-root.json
  • 11:16 marostegui@cumin1002: dbctl commit (dc=all): 'db1194 (re)pooling @ 100%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73308 and previous config saved to /var/cache/conftool/dbconfig/20250206-111559-root.json
  • 11:10 marostegui@cumin1002: dbctl commit (dc=all): 'db2208 (re)pooling @ 25%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73307 and previous config saved to /var/cache/conftool/dbconfig/20250206-111041-root.json
  • 11:00 marostegui@cumin1002: dbctl commit (dc=all): 'db1194 (re)pooling @ 75%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73306 and previous config saved to /var/cache/conftool/dbconfig/20250206-110054-root.json
  • 10:55 marostegui@cumin1002: dbctl commit (dc=all): 'db2208 (re)pooling @ 10%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73303 and previous config saved to /var/cache/conftool/dbconfig/20250206-105536-root.json
  • 10:45 marostegui@cumin1002: dbctl commit (dc=all): 'db1194 (re)pooling @ 50%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73301 and previous config saved to /var/cache/conftool/dbconfig/20250206-104549-root.json
  • 10:30 marostegui@cumin1002: dbctl commit (dc=all): 'db1194 (re)pooling @ 25%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73300 and previous config saved to /var/cache/conftool/dbconfig/20250206-103044-root.json
  • 10:15 marostegui@cumin1002: dbctl commit (dc=all): 'db1194 (re)pooling @ 10%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73299 and previous config saved to /var/cache/conftool/dbconfig/20250206-101538-root.json
  • 09:55 marostegui@cumin1002: dbctl commit (dc=all): 'db2236 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P73298 and previous config saved to /var/cache/conftool/dbconfig/20250206-095515-root.json
  • 09:52 aikochou@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revision-models' for release 'main'.
  • 09:47 marostegui@cumin1002: dbctl commit (dc=all): 'es1040 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P73297 and previous config saved to /var/cache/conftool/dbconfig/20250206-094724-root.json
  • 09:40 marostegui@cumin1002: dbctl commit (dc=all): 'db2236 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P73296 and previous config saved to /var/cache/conftool/dbconfig/20250206-094009-root.json
  • 09:33 jnuche@deploy2002: rebuilt and synchronized wikiversions files: group2 to 1.44.0-wmf.15 refs T382366
  • 09:32 marostegui@cumin1002: dbctl commit (dc=all): 'es1040 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P73295 and previous config saved to /var/cache/conftool/dbconfig/20250206-093218-root.json
  • 09:25 marostegui@cumin1002: dbctl commit (dc=all): 'db2236 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P73294 and previous config saved to /var/cache/conftool/dbconfig/20250206-092504-root.json
  • 09:21 marostegui@cumin1002: dbctl commit (dc=all): 'db2213 (re)pooling @ 100%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73293 and previous config saved to /var/cache/conftool/dbconfig/20250206-092139-root.json
  • 09:19 elukey@puppetserver1001: conftool action : set/pooled=yes:weight=10; selector: name=wikikube-worker1002.eqiad.wmnet,dc=eqiad,cluster=maps,service=kartotherian-k8s-ssl
  • 09:19 elukey@puppetserver1001: conftool action : set/pooled=yes:weight=10; selector: name=wikikube-worker2001.codfw.wmnet,dc=codfw,cluster=maps,service=kartotherian-k8s-ssl
  • 09:17 marostegui@cumin1002: dbctl commit (dc=all): 'es1040 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P73291 and previous config saved to /var/cache/conftool/dbconfig/20250206-091713-root.json
  • 09:10 marostegui@cumin1002: dbctl commit (dc=all): 'db2236 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P73290 and previous config saved to /var/cache/conftool/dbconfig/20250206-090959-root.json
  • 09:06 marostegui@cumin1002: dbctl commit (dc=all): 'db2213 (re)pooling @ 75%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73289 and previous config saved to /var/cache/conftool/dbconfig/20250206-090634-root.json
  • 09:02 marostegui@cumin1002: dbctl commit (dc=all): 'es1040 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P73288 and previous config saved to /var/cache/conftool/dbconfig/20250206-090208-root.json
  • 08:54 marostegui@cumin1002: dbctl commit (dc=all): 'db2236 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P73287 and previous config saved to /var/cache/conftool/dbconfig/20250206-085454-root.json
  • 08:51 marostegui@cumin1002: dbctl commit (dc=all): 'db2213 (re)pooling @ 50%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73286 and previous config saved to /var/cache/conftool/dbconfig/20250206-085129-root.json
  • 08:51 root@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2236.codfw.wmnet
  • 08:47 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2236.codfw.wmnet with reason: maintenance
  • 08:47 marostegui@cumin1002: dbctl commit (dc=all): 'es1040 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P73285 and previous config saved to /var/cache/conftool/dbconfig/20250206-084703-root.json
  • 08:44 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1040.eqiad.wmnet with reason: maintenance
  • 08:43 root@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for es1040.eqiad.wmnet
  • 08:38 root@cumin1002: START - Cookbook sre.mysql.upgrade for db2236.codfw.wmnet
  • 08:37 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2236', diff saved to https://phabricator.wikimedia.org/P73284 and previous config saved to /var/cache/conftool/dbconfig/20250206-083758-marostegui.json
  • 08:37 root@cumin1002: START - Cookbook sre.mysql.upgrade for es1040.eqiad.wmnet
  • 08:36 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es1040', diff saved to https://phabricator.wikimedia.org/P73283 and previous config saved to /var/cache/conftool/dbconfig/20250206-083654-marostegui.json
  • 08:36 marostegui@cumin1002: dbctl commit (dc=all): 'db2213 (re)pooling @ 25%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73282 and previous config saved to /var/cache/conftool/dbconfig/20250206-083623-root.json
  • 08:36 kartik@deploy2002: Finished scap sync-world: Backport for Enable section translation on Kanuri Wikipedia (T385185) (duration: 12m 25s)
  • 08:31 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2147 (T384592)', diff saved to https://phabricator.wikimedia.org/P73281 and previous config saved to /var/cache/conftool/dbconfig/20250206-083145-marostegui.json
  • 08:31 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2147.codfw.wmnet with reason: Maintenance
  • 08:30 elukey@puppetserver1001: conftool action : set/pooled=yes:weight=1; selector: name=wikikube-worker2001.codfw.wmnet,dc=codfw,cluster=maps,service=kartotherian-k8s-ssl
  • 08:29 kartik@deploy2002: kartik, pppery: Continuing with sync
  • 08:28 elukey@puppetserver1001: conftool action : set/pooled=yes:weight=1; selector: name=wikikube-worker1002.eqiad.wmnet,dc=eqiad,cluster=maps,service=kartotherian-k8s-ssl
  • 08:26 kartik@deploy2002: kartik, pppery: Backport for Enable section translation on Kanuri Wikipedia (T385185) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:24 moritzm: rebalance codfw/B following OS updates T382508
  • 08:23 kartik@deploy2002: Started scap sync-world: Backport for Enable section translation on Kanuri Wikipedia (T385185)
  • 08:22 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1038.eqiad.wmnet
  • 08:21 marostegui@cumin1002: dbctl commit (dc=all): 'db2213 (re)pooling @ 10%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73280 and previous config saved to /var/cache/conftool/dbconfig/20250206-082117-root.json
  • 08:18 kartik@deploy2002: Finished scap sync-world: Backport for Make MT limit more strict by 10 Percentage Point in Bhojpuri Wikipedia (T383789) (duration: 13m 34s)
  • 08:17 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1038.eqiad.wmnet
  • 08:16 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1038.eqiad.wmnet
  • 08:12 Ammar: T385770 Ran mwscript-k8s extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=dawiki --logwiki=metawiki 'Sprucecopse' 'Renamed user 7cf752558fab818efdcacff8255d91ca'
  • 08:11 kartik@deploy2002: kartik: Continuing with sync
  • 08:09 kartik@deploy2002: kartik: Backport for Make MT limit more strict by 10 Percentage Point in Bhojpuri Wikipedia (T383789) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:05 kartik@deploy2002: Started scap sync-world: Backport for Make MT limit more strict by 10 Percentage Point in Bhojpuri Wikipedia (T383789)
  • 07:28 root@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2213.codfw.wmnet with reason: Index rebuild
  • 07:28 root@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2213.codfw.wmnet
  • 07:23 root@cumin1002: START - Cookbook sre.mysql.upgrade for db2213.codfw.wmnet
  • 07:21 marostegui@dns1006: END - running authdns-update
  • 07:20 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2213 T385148', diff saved to https://phabricator.wikimedia.org/P73279 and previous config saved to /var/cache/conftool/dbconfig/20250206-072020-marostegui.json
  • 07:19 marostegui@dns1006: START - running authdns-update
  • 07:19 marostegui@cumin1002: dbctl commit (dc=all): 'Promote db2192 to s5 primary and set section read-write T385148', diff saved to https://phabricator.wikimedia.org/P73278 and previous config saved to /var/cache/conftool/dbconfig/20250206-071902-root.json
  • 07:18 marostegui@cumin1002: dbctl commit (dc=all): 'Set s5 codfw as read-only for maintenance - T385148', diff saved to https://phabricator.wikimedia.org/P73277 and previous config saved to /var/cache/conftool/dbconfig/20250206-071836-root.json
  • 07:18 marostegui: Starting s5 codfw failover from db2213 to db2192 - T385148
  • 07:07 root@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1194.eqiad.wmnet with reason: Index rebuild
  • 07:07 root@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2208.codfw.wmnet with reason: Index rebuild
  • 07:05 root@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db1194.eqiad.wmnet
  • 07:02 root@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2208.codfw.wmnet
  • 06:59 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 24 hosts with reason: Primary switchover s5 T385148
  • 06:59 marostegui@cumin1002: dbctl commit (dc=all): 'Set db2192 with weight 0 T385148', diff saved to https://phabricator.wikimedia.org/P73276 and previous config saved to /var/cache/conftool/dbconfig/20250206-065925-root.json
  • 06:58 root@cumin1002: START - Cookbook sre.mysql.upgrade for db2208.codfw.wmnet
  • 06:58 root@cumin1002: START - Cookbook sre.mysql.upgrade for db1194.eqiad.wmnet
  • 06:58 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2208 db1194 T385550', diff saved to https://phabricator.wikimedia.org/P73275 and previous config saved to /var/cache/conftool/dbconfig/20250206-065759-marostegui.json
  • 04:55 ejegg: payments-wiki upgraded from MW 1.39 to MW 1.43 (needs db update)
  • 04:01 ejegg: upgraded payments-wiki-staging from 7eeb643 to 4cdd67b
  • 03:22 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 03:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1249 (T384592)', diff saved to https://phabricator.wikimedia.org/P73274 and previous config saved to /var/cache/conftool/dbconfig/20250206-032148-marostegui.json
  • 03:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P73273 and previous config saved to /var/cache/conftool/dbconfig/20250206-030641-marostegui.json
  • 02:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P73272 and previous config saved to /var/cache/conftool/dbconfig/20250206-025134-marostegui.json
  • 02:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1249 (T384592)', diff saved to https://phabricator.wikimedia.org/P73271 and previous config saved to /var/cache/conftool/dbconfig/20250206-023626-marostegui.json

2025-02-05

  • 23:50 jdrewniak@deploy2002: Finished scap sync-world: Backport for Speed tests: Add HTML files for touch action (T118509) (duration: 11m 10s)
  • 23:44 jdrewniak@deploy2002: jdlrobson, jdrewniak: Continuing with sync
  • 23:42 jdrewniak@deploy2002: jdlrobson, jdrewniak: Backport for Speed tests: Add HTML files for touch action (T118509) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 23:39 jdrewniak@deploy2002: Started scap sync-world: Backport for Speed tests: Add HTML files for touch action (T118509)
  • 23:35 cwhite@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts logstash2026.codfw.wmnet
  • 23:35 cwhite@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 23:35 cwhite@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: logstash2026.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwhite@cumin2002"
  • 23:35 cwhite@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: logstash2026.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwhite@cumin2002"
  • 23:31 jdrewniak@deploy2002: Finished scap sync-world: Backport for Deploy dark mode to anonymous users for certain projects (February 2025) (T383451) (duration: 12m 27s)
  • 23:30 cwhite@cumin2002: START - Cookbook sre.dns.netbox
  • 23:25 jdrewniak@deploy2002: jdrewniak, jdlrobson: Continuing with sync
  • 23:22 jdrewniak@deploy2002: jdrewniak, jdlrobson: Backport for Deploy dark mode to anonymous users for certain projects (February 2025) (T383451) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 23:19 jdrewniak@deploy2002: Started scap sync-world: Backport for Deploy dark mode to anonymous users for certain projects (February 2025) (T383451)
  • 23:19 cwhite@cumin2002: START - Cookbook sre.hosts.decommission for hosts logstash2026.codfw.wmnet
  • 23:07 cwhite@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts logstash2027.codfw.wmnet
  • 23:07 cwhite@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 23:07 cwhite@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: logstash2027.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwhite@cumin2002"
  • 23:06 cwhite@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: logstash2027.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwhite@cumin2002"
  • 23:00 cwhite@cumin2002: START - Cookbook sre.dns.netbox
  • 22:55 cwhite@cumin2002: START - Cookbook sre.hosts.decommission for hosts logstash2027.codfw.wmnet
  • 22:54 cwhite@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts logstash2028.codfw.wmnet
  • 22:54 cwhite@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 22:54 cwhite@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: logstash2028.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwhite@cumin2002"
  • 22:51 cwhite@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: logstash2028.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwhite@cumin2002"
  • 22:45 cwhite@cumin2002: START - Cookbook sre.dns.netbox
  • 22:41 cwhite@cumin2002: START - Cookbook sre.hosts.decommission for hosts logstash2028.codfw.wmnet
  • 22:40 cwhite@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts logstash2029.codfw.wmnet
  • 22:40 cwhite@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 22:40 cwhite@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: logstash2029.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwhite@cumin2002"
  • 22:40 cwhite@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: logstash2029.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwhite@cumin2002"
  • 22:36 cwhite@cumin2002: START - Cookbook sre.dns.netbox
  • 22:31 cwhite@cumin2002: START - Cookbook sre.hosts.decommission for hosts logstash2029.codfw.wmnet
  • 21:42 jdrewniak@deploy2002: Finished scap sync-world: Backport for Enable $wgAllowAuthenticatedCrossOrigin on testwiki (T322944) (duration: 13m 50s)
  • 21:35 jdrewniak@deploy2002: lucaswerkmeister, jdrewniak: Continuing with sync
  • 21:31 jdrewniak@deploy2002: lucaswerkmeister, jdrewniak: Backport for Enable $wgAllowAuthenticatedCrossOrigin on testwiki (T322944) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:28 jdrewniak@deploy2002: Started scap sync-world: Backport for Enable $wgAllowAuthenticatedCrossOrigin on testwiki (T322944)
  • 21:27 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1249 (T384592)', diff saved to https://phabricator.wikimedia.org/P73270 and previous config saved to /var/cache/conftool/dbconfig/20250205-212751-marostegui.json
  • 21:27 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1249.eqiad.wmnet with reason: Maintenance
  • 21:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1248 (T384592)', diff saved to https://phabricator.wikimedia.org/P73269 and previous config saved to /var/cache/conftool/dbconfig/20250205-212729-marostegui.json
  • 21:21 cdanis: upgraded python3-conftool-requestctl and friends on puppetservers/puppetmasters
  • 21:14 cdanis: released new conftool 5.0.2 for all distros to apt.wm.o
  • 21:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P73268 and previous config saved to /var/cache/conftool/dbconfig/20250205-211222-marostegui.json
  • 20:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P73267 and previous config saved to /var/cache/conftool/dbconfig/20250205-205715-marostegui.json
  • 20:55 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1054.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 20:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1248 (T384592)', diff saved to https://phabricator.wikimedia.org/P73266 and previous config saved to /var/cache/conftool/dbconfig/20250205-204208-marostegui.json
  • 20:10 sukhe: granting brett member,reader role on beta
  • 18:33 marostegui@cumin1002: dbctl commit (dc=all): 'db1237 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P73264 and previous config saved to /var/cache/conftool/dbconfig/20250205-183318-root.json
  • 18:33 marostegui@cumin1002: dbctl commit (dc=all): 'db1179 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P73263 and previous config saved to /var/cache/conftool/dbconfig/20250205-183306-root.json
  • 18:31 swfrench@deploy2002: Finished scap sync-world: Backport for Enroll 50% of client sessions in PHP 8.1 (T383845) (duration: 12m 57s)
  • 18:24 swfrench@deploy2002: swfrench: Continuing with sync
  • 18:22 swfrench@deploy2002: swfrench: Backport for Enroll 50% of client sessions in PHP 8.1 (T383845) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 18:18 marostegui@cumin1002: dbctl commit (dc=all): 'db1237 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P73262 and previous config saved to /var/cache/conftool/dbconfig/20250205-181813-root.json
  • 18:18 swfrench@deploy2002: Started scap sync-world: Backport for Enroll 50% of client sessions in PHP 8.1 (T383845)
  • 18:18 marostegui@cumin1002: dbctl commit (dc=all): 'db1179 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P73261 and previous config saved to /var/cache/conftool/dbconfig/20250205-181801-root.json
  • 18:11 swfrench-wmf: mw-api-int to ~ 5% of traffic on PHP 8.1 - T383845
  • 18:11 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
  • 18:10 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
  • 18:10 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
  • 18:10 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
  • 18:07 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
  • 18:07 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
  • 18:06 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
  • 18:06 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
  • 18:04 swfrench-wmf: scaled mw-api-ext and mw-web next releases to 25% of main - T383845
  • 18:04 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
  • 18:03 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
  • 18:03 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
  • 18:03 marostegui@cumin1002: dbctl commit (dc=all): 'db1237 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P73259 and previous config saved to /var/cache/conftool/dbconfig/20250205-180307-root.json
  • 18:03 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
  • 18:02 marostegui@cumin1002: dbctl commit (dc=all): 'db1179 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P73258 and previous config saved to /var/cache/conftool/dbconfig/20250205-180256-root.json
  • 18:01 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
  • 18:01 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
  • 18:01 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
  • 18:00 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
  • 17:48 marostegui@cumin1002: dbctl commit (dc=all): 'db1237 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P73257 and previous config saved to /var/cache/conftool/dbconfig/20250205-174802-root.json
  • 17:47 marostegui@cumin1002: dbctl commit (dc=all): 'db1179 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P73256 and previous config saved to /var/cache/conftool/dbconfig/20250205-174750-root.json
  • 17:46 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host ganeti1054.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 17:45 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1054.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 17:43 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host ganeti1054.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 17:32 marostegui@cumin1002: dbctl commit (dc=all): 'db1237 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P73255 and previous config saved to /var/cache/conftool/dbconfig/20250205-173257-root.json
  • 17:32 marostegui@cumin1002: dbctl commit (dc=all): 'db1179 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P73254 and previous config saved to /var/cache/conftool/dbconfig/20250205-173245-root.json
  • 17:30 mutante: phab1004 - rm /lib/systemd/system/phabricator_stats_job_mfa_check.* for gerrit:1117489 T299403
  • 16:33 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
  • 16:32 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
  • 16:31 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
  • 16:30 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
  • 16:00 klausman@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
  • 15:59 klausman@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
  • 15:54 marostegui@cumin1002: dbctl commit (dc=all): 'db2221 (re)pooling @ 100%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73252 and previous config saved to /var/cache/conftool/dbconfig/20250205-155456-root.json
  • 15:51 swfrench-wmf: finished deploying conftool 5.0.1-1 - T383324
  • 15:39 marostegui@cumin1002: dbctl commit (dc=all): 'db2221 (re)pooling @ 75%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73251 and previous config saved to /var/cache/conftool/dbconfig/20250205-153951-root.json
  • 15:24 marostegui@cumin1002: dbctl commit (dc=all): 'db2221 (re)pooling @ 50%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73250 and previous config saved to /var/cache/conftool/dbconfig/20250205-152445-root.json
  • 15:20 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 15:19 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 15:19 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 15:18 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 15:15 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 15:15 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 14:56 marostegui@cumin1002: dbctl commit (dc=all): 'db1202 (re)pooling @ 75%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73247 and previous config saved to /var/cache/conftool/dbconfig/20250205-145647-root.json
  • 14:54 marostegui@cumin1002: dbctl commit (dc=all): 'db2221 (re)pooling @ 10%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73246 and previous config saved to /var/cache/conftool/dbconfig/20250205-145434-root.json
  • 14:53 Lucas_WMDE: UTC afternoon backport+config window done
  • 14:52 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for kywiki: create draft namespace (T385593) (duration: 10m 54s)
  • 14:46 cmooney@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cr2-magru with reason: IBGP instability from cr1 to cr2 in magru causing ping failures from alert1002
  • 14:46 lucaswerkmeister-wmde@deploy2002: anzx, lucaswerkmeister-wmde: Continuing with sync
  • 14:45 lucaswerkmeister-wmde@deploy2002: anzx, lucaswerkmeister-wmde: Backport for kywiki: create draft namespace (T385593) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:43 jynus: deploy new grants to analytics_meta T385565
  • 14:43 marostegui@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db1237.eqiad.wmnet onto db1179.eqiad.wmnet
  • 14:41 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for kywiki: create draft namespace (T385593)
  • 14:41 marostegui@cumin1002: dbctl commit (dc=all): 'db1202 (re)pooling @ 50%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73245 and previous config saved to /var/cache/conftool/dbconfig/20250205-144141-root.json
  • 14:40 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Add sourceswiki to $wgImportSources for all Wikisources (T385591) (duration: 29m 00s)
  • 14:39 klausman@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
  • 14:39 klausman@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
  • 14:33 lucaswerkmeister-wmde@deploy2002: jhsoby, lucaswerkmeister-wmde: Continuing with sync
  • 14:26 marostegui@cumin1002: dbctl commit (dc=all): 'db1202 (re)pooling @ 25%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73244 and previous config saved to /var/cache/conftool/dbconfig/20250205-142636-root.json
  • 14:15 lucaswerkmeister-wmde@deploy2002: jhsoby, lucaswerkmeister-wmde: Backport for Add sourceswiki to $wgImportSources for all Wikisources (T385591) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:11 marostegui@cumin1002: dbctl commit (dc=all): 'db1202 (re)pooling @ 10%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73243 and previous config saved to /var/cache/conftool/dbconfig/20250205-141131-root.json
  • 14:11 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Add sourceswiki to $wgImportSources for all Wikisources (T385591)
  • 14:00 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1248 (T384592)', diff saved to https://phabricator.wikimedia.org/P73241 and previous config saved to /var/cache/conftool/dbconfig/20250205-140039-marostegui.json
  • 14:00 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1248.eqiad.wmnet with reason: Maintenance
  • 14:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1247 (T384592)', diff saved to https://phabricator.wikimedia.org/P73240 and previous config saved to /var/cache/conftool/dbconfig/20250205-140017-marostegui.json
  • 13:53 marostegui@cumin1002: dbctl commit (dc=all): 'db1156 (re)pooling @ 100%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73238 and previous config saved to /var/cache/conftool/dbconfig/20250205-135320-root.json
  • 13:49 jynus: deploy removal of old hosts for the m1 dbbackups backup user T383871
  • 13:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P73237 and previous config saved to /var/cache/conftool/dbconfig/20250205-134510-marostegui.json
  • 13:38 marostegui@cumin1002: dbctl commit (dc=all): 'db1156 (re)pooling @ 75%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73236 and previous config saved to /var/cache/conftool/dbconfig/20250205-133815-root.json
  • 13:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P73235 and previous config saved to /var/cache/conftool/dbconfig/20250205-133003-marostegui.json
  • 13:25 klausman@deploy2002: helmfile [codfw] DONE helmfile.d/services/changeprop: apply
  • 13:24 klausman@deploy2002: helmfile [codfw] START helmfile.d/services/changeprop: apply
  • 13:24 klausman@deploy2002: helmfile [staging] DONE helmfile.d/services/changeprop: apply
  • 13:24 klausman@deploy2002: helmfile [staging] START helmfile.d/services/changeprop: apply
  • 13:23 fceratto@cumin1002: dbctl commit (dc=all): 'db1251 (re)pooling @ 100%: Pooling in', diff saved to https://phabricator.wikimedia.org/P73234 and previous config saved to /var/cache/conftool/dbconfig/20250205-132319-fceratto.json
  • 13:23 marostegui@cumin1002: dbctl commit (dc=all): 'db1156 (re)pooling @ 50%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73233 and previous config saved to /var/cache/conftool/dbconfig/20250205-132309-root.json
  • 13:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1247 (T384592)', diff saved to https://phabricator.wikimedia.org/P73232 and previous config saved to /var/cache/conftool/dbconfig/20250205-131456-marostegui.json
  • 13:08 fceratto@cumin1002: dbctl commit (dc=all): 'db1251 (re)pooling @ 75%: Pooling in', diff saved to https://phabricator.wikimedia.org/P73231 and previous config saved to /var/cache/conftool/dbconfig/20250205-130813-fceratto.json
  • 13:08 marostegui@cumin1002: dbctl commit (dc=all): 'db1156 (re)pooling @ 25%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73230 and previous config saved to /var/cache/conftool/dbconfig/20250205-130804-root.json
  • 12:53 fceratto@cumin1002: dbctl commit (dc=all): 'db1251 (re)pooling @ 50%: Pooling in', diff saved to https://phabricator.wikimedia.org/P73228 and previous config saved to /var/cache/conftool/dbconfig/20250205-125308-fceratto.json
  • 12:53 marostegui@cumin1002: dbctl commit (dc=all): 'db1156 (re)pooling @ 10%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73227 and previous config saved to /var/cache/conftool/dbconfig/20250205-125259-root.json
  • 12:50 marostegui@cumin1002: START - Cookbook sre.mysql.clone of db1237.eqiad.wmnet onto db1179.eqiad.wmnet
  • 12:46 marostegui@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db1237.eqiad.wmnet onto db1179.eqiad.wmnet
  • 12:38 fceratto@cumin1002: dbctl commit (dc=all): 'db1251 (re)pooling @ 35%: Pooling in', diff saved to https://phabricator.wikimedia.org/P73226 and previous config saved to /var/cache/conftool/dbconfig/20250205-123803-fceratto.json
  • 12:22 fceratto@cumin1002: dbctl commit (dc=all): 'db1251 (re)pooling @ 30%: Pooling in', diff saved to https://phabricator.wikimedia.org/P73225 and previous config saved to /var/cache/conftool/dbconfig/20250205-122257-fceratto.json
  • 12:12 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
  • 12:12 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-cron: apply
  • 12:12 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
  • 12:12 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
  • 12:09 mvolz@deploy2002: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
  • 12:09 mvolz@deploy2002: helmfile [eqiad] START helmfile.d/services/citoid: apply
  • 12:07 fceratto@cumin1002: dbctl commit (dc=all): 'db1251 (re)pooling @ 25%: Pooling in', diff saved to https://phabricator.wikimedia.org/P73224 and previous config saved to /var/cache/conftool/dbconfig/20250205-120752-fceratto.json
  • 12:06 mvolz@deploy2002: helmfile [codfw] DONE helmfile.d/services/citoid: apply
  • 12:05 mvolz@deploy2002: helmfile [codfw] START helmfile.d/services/citoid: apply
  • 12:03 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/citoid: apply
  • 12:03 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/citoid: apply
  • 12:00 fnegri@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1017.eqiad.wmnet,service=s1
  • 12:00 fnegri@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1017.eqiad.wmnet,service=s3
  • 12:00 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
  • 12:00 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-cron: apply
  • 11:56 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb1017.eqiad.wmnet with reason: Rebuild tables
  • 11:52 fceratto@cumin1002: dbctl commit (dc=all): 'db1251 (re)pooling @ 20%: Pooling in', diff saved to https://phabricator.wikimedia.org/P73223 and previous config saved to /var/cache/conftool/dbconfig/20250205-115247-fceratto.json
  • 11:42 jiji@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
  • 11:42 jiji@deploy2002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
  • 11:41 fnegri@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host clouddb1017.eqiad.wmnet
  • 11:38 fnegri@cumin1002: START - Cookbook sre.hosts.reboot-single for host clouddb1017.eqiad.wmnet
  • 11:37 fceratto@cumin1002: dbctl commit (dc=all): 'db1251 (re)pooling @ 15%: Pooling in', diff saved to https://phabricator.wikimedia.org/P73222 and previous config saved to /var/cache/conftool/dbconfig/20250205-113741-fceratto.json
  • 11:34 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-jobrunner: apply
  • 11:34 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb1014.eqiad.wmnet with reason: Rebuild tables
  • 11:34 jiji@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-jobrunner: apply
  • 11:33 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on clouddb1018.eqiad.wmnet with reason: Rebuild tables
  • 11:33 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on clouddb1014.eqiad.wmnet with reason: Rebuild tables
  • 11:33 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-redacteddb1001.eqiad.wmnet with reason: Rebuild tables
  • 11:32 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1155.eqiad.wmnet with reason: Rebuild tables
  • 11:31 fnegri@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1017.eqiad.wmnet with reason: Rebooting clouddb1017 T384946
  • 11:31 fnegri@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1017.eqiad.wmnet,service=s3
  • 11:31 fnegri@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1017.eqiad.wmnet,service=31
  • 11:31 fnegri@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1017.eqiad.wmnet,service=s1
  • 11:28 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 11:28 jiji@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 11:27 jiji@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 11:27 jiji@deploy2002: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 11:26 jiji@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
  • 11:25 jiji@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox: apply
  • 11:22 fceratto@cumin1002: dbctl commit (dc=all): 'db1251 (re)pooling @ 10%: Pooling in', diff saved to https://phabricator.wikimedia.org/P73221 and previous config saved to /var/cache/conftool/dbconfig/20250205-112236-fceratto.json
  • 11:11 godog: bounce thanos-query on titan1002
  • 11:07 fceratto@cumin1002: dbctl commit (dc=all): 'db1251 (re)pooling @ 7%: Pooling in', diff saved to https://phabricator.wikimedia.org/P73220 and previous config saved to /var/cache/conftool/dbconfig/20250205-110731-fceratto.json
  • 11:06 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1179', diff saved to https://phabricator.wikimedia.org/P73219 and previous config saved to /var/cache/conftool/dbconfig/20250205-110628-marostegui.json
  • 11:03 root@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1202.eqiad.wmnet with reason: Index rebuild
  • 11:03 root@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2221.codfw.wmnet with reason: Index rebuild
  • 10:59 marostegui@cumin1002: dbctl commit (dc=all): 'db1179 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P73218 and previous config saved to /var/cache/conftool/dbconfig/20250205-105928-root.json
  • 10:51 marostegui@cumin1002: START - Cookbook sre.mysql.clone of db1237.eqiad.wmnet onto db1179.eqiad.wmnet
  • 10:47 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1237', diff saved to https://phabricator.wikimedia.org/P73217 and previous config saved to /var/cache/conftool/dbconfig/20250205-104742-marostegui.json
  • 10:47 fceratto@cumin1002: dbctl commit (dc=all): 'db1169 (re)pooling @ 100%: Repooling after cloning db1251', diff saved to https://phabricator.wikimedia.org/P73216 and previous config saved to /var/cache/conftool/dbconfig/20250205-104732-fceratto.json
  • 10:45 fceratto@cumin1002: dbctl commit (dc=all): 'db1251 (re)pooling @ 5%: Pooling host to 5%', diff saved to https://phabricator.wikimedia.org/P73215 and previous config saved to /var/cache/conftool/dbconfig/20250205-104543-fceratto.json
  • 10:45 urbanecm@deploy2002: Finished scap sync-world: Backport for fix(AddLink): button should show after link preview (T385542) (duration: 12m 15s)
  • 10:44 marostegui@cumin1002: dbctl commit (dc=all): 'db1179 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P73214 and previous config saved to /var/cache/conftool/dbconfig/20250205-104423-root.json
  • 10:43 marostegui: Set x1 to SBR for a bit T385645
  • 10:39 urbanecm@deploy2002: urbanecm: Continuing with sync
  • 10:37 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1179', diff saved to https://phabricator.wikimedia.org/P73213 and previous config saved to /var/cache/conftool/dbconfig/20250205-103738-marostegui.json
  • 10:36 urbanecm@deploy2002: urbanecm: Backport for fix(AddLink): button should show after link preview (T385542) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 10:33 urbanecm@deploy2002: Started scap sync-world: Backport for fix(AddLink): button should show after link preview (T385542)
  • 10:32 fceratto@cumin1002: dbctl commit (dc=all): 'db1169 (re)pooling @ 75%: Repooling after cloning db1251', diff saved to https://phabricator.wikimedia.org/P73212 and previous config saved to /var/cache/conftool/dbconfig/20250205-103227-fceratto.json
  • 10:30 klausman@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop: apply
  • 10:29 klausman@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop: apply
  • 10:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1179 (T385645)', diff saved to https://phabricator.wikimedia.org/P73211 and previous config saved to /var/cache/conftool/dbconfig/20250205-102758-marostegui.json
  • 10:27 root@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db1202.eqiad.wmnet
  • 10:27 klausman: pushing Changeprop patch (k8s values) https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/1117063
  • 10:26 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1179 (T385645)', diff saved to https://phabricator.wikimedia.org/P73210 and previous config saved to /var/cache/conftool/dbconfig/20250205-102650-marostegui.json
  • 10:26 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1179.eqiad.wmnet with reason: Maintenance
  • 10:26 fceratto@cumin1002: dbctl commit (dc=all): 'db1251 (re)pooling @ 1%: Pooling in new host', diff saved to https://phabricator.wikimedia.org/P73209 and previous config saved to /var/cache/conftool/dbconfig/20250205-102614-fceratto.json
  • 10:25 root@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2221.codfw.wmnet
  • 10:20 fceratto@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1251.eqiad.wmnet
  • 10:20 fceratto@cumin1002: START - Cookbook sre.hosts.remove-downtime for db1251.eqiad.wmnet
  • 10:20 root@cumin1002: START - Cookbook sre.mysql.upgrade for db1202.eqiad.wmnet
  • 10:20 root@cumin1002: START - Cookbook sre.mysql.upgrade for db2221.codfw.wmnet
  • 10:20 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1202, db2221 for index rebuild', diff saved to https://phabricator.wikimedia.org/P73208 and previous config saved to /var/cache/conftool/dbconfig/20250205-102012-marostegui.json
  • 10:18 elukey@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
  • 10:17 elukey@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
  • 10:17 fceratto@cumin1002: dbctl commit (dc=all): 'db1169 (re)pooling @ 50%: Repooling after cloning db1251', diff saved to https://phabricator.wikimedia.org/P73207 and previous config saved to /var/cache/conftool/dbconfig/20250205-101721-fceratto.json
  • 10:14 dcausse: restarting blazegraph on wdqs1012 (BlazegraphFreeAllocatorsDecreasingRapidly)
  • 10:02 fceratto@cumin1002: dbctl commit (dc=all): 'db1169 (re)pooling @ 25%: Repooling after cloning db1251', diff saved to https://phabricator.wikimedia.org/P73205 and previous config saved to /var/cache/conftool/dbconfig/20250205-100216-fceratto.json
  • 09:58 mvernon@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on ms-be2051.codfw.wmnet with reason: disk failed, due decom soon
  • 09:56 mvernon@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on ms-be2075.codfw.wmnet with reason: hardware broken awaiting vendor action
  • 09:55 kevinbazira@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
  • 09:52 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-models' for release 'main' .
  • 09:47 fceratto@cumin1002: dbctl commit (dc=all): 'db1169 (re)pooling @ 10%: Repooling after cloning db1251', diff saved to https://phabricator.wikimedia.org/P73203 and previous config saved to /var/cache/conftool/dbconfig/20250205-094711-fceratto.json
  • 09:46 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on clouddb1018.eqiad.wmnet with reason: Rebuild tables
  • 09:39 root@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Index rebuild
  • 09:38 root@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db1156.eqiad.wmnet
  • 09:32 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-redacteddb1001.eqiad.wmnet with reason: Rebuild tables
  • 09:32 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on clouddb1014.eqiad.wmnet with reason: Rebuild tables
  • 09:32 root@cumin1002: START - Cookbook sre.mysql.upgrade for db1156.eqiad.wmnet
  • 09:31 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1156 for index rebuild', diff saved to https://phabricator.wikimedia.org/P73202 and previous config saved to /var/cache/conftool/dbconfig/20250205-093152-marostegui.json
  • 09:31 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db[1155-1156].eqiad.wmnet with reason: Rebuild tables
  • 09:13 jnuche@deploy2002: rebuilt and synchronized wikiversions files: group1 to 1.44.0-wmf.15 refs T382366
  • 06:53 eileen: civicrm upgraded from 5e01bd21 to d027bc7b
  • 06:39 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1247 (T384592)', diff saved to https://phabricator.wikimedia.org/P73201 and previous config saved to /var/cache/conftool/dbconfig/20250205-063911-marostegui.json
  • 06:39 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1247.eqiad.wmnet with reason: Maintenance
  • 05:50 kart_: Updated cxserver to 2025-02-03-095815-production (T377966, T385185)
  • 05:49 kartik@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
  • 05:49 kartik@deploy2002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
  • 05:44 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
  • 05:43 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/cxserver: apply
  • 05:31 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
  • 05:31 kartik@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
  • 03:04 eileen: config revision changed from f6bc2c51 to f1416f7a
  • 02:45 eileen: civicrm upgraded from ab392bd2 to 5e01bd21
  • 02:34 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1245.eqiad.wmnet with reason: Maintenance
  • 02:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1244 (T384592)', diff saved to https://phabricator.wikimedia.org/P73200 and previous config saved to /var/cache/conftool/dbconfig/20250205-023428-marostegui.json
  • 02:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1244', diff saved to https://phabricator.wikimedia.org/P73199 and previous config saved to /var/cache/conftool/dbconfig/20250205-021921-marostegui.json
  • 02:13 eileen: civicrm upgraded from b869d0c3 to ab392bd2
  • 02:12 wfan: donorwiki revision changed from a039cd50 to 98027151
  • 02:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1244', diff saved to https://phabricator.wikimedia.org/P73198 and previous config saved to /var/cache/conftool/dbconfig/20250205-020414-marostegui.json
  • 02:02 eileen: config revision changed from dbf6e86a to f6bc2c51
  • 01:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1244 (T384592)', diff saved to https://phabricator.wikimedia.org/P73197 and previous config saved to /var/cache/conftool/dbconfig/20250205-014907-marostegui.json
  • 01:28 zabe: zabe@mwmaint2002:/tmp/uploads$ mwscript importImages.php --wiki=commonswiki --comment-ext=txt --user=Dyolf77 /tmp/uploads # T385642
  • 00:30 eileen: civicrm upgraded from abe0fc61 to b869d0c3
  • 00:13 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1222 (T371742)', diff saved to https://phabricator.wikimedia.org/P73196 and previous config saved to /var/cache/conftool/dbconfig/20250205-001309-ladsgroup.json

2025-02-04

  • 23:58 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P73195 and previous config saved to /var/cache/conftool/dbconfig/20250204-235802-ladsgroup.json
  • 23:42 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P73194 and previous config saved to /var/cache/conftool/dbconfig/20250204-234255-ladsgroup.json
  • 23:27 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1222 (T371742)', diff saved to https://phabricator.wikimedia.org/P73193 and previous config saved to /var/cache/conftool/dbconfig/20250204-232748-ladsgroup.json
  • 22:37 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1222 (T371742)', diff saved to https://phabricator.wikimedia.org/P73192 and previous config saved to /var/cache/conftool/dbconfig/20250204-223744-ladsgroup.json
  • 22:37 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1222.eqiad.wmnet with reason: Maintenance
  • 22:35 ladsgroup@deploy2002: Synchronized portals: Bump portals to HEAD (duration: 03m 12s)
  • 22:32 ladsgroup@deploy2002: Synchronized portals/wikipedia.org/assets: Bump portals to HEAD (T368221 T373204) (duration: 09m 30s)
  • 22:18 ladsgroup@deploy2002: Finished scap sync-world: Backport for Set categorylinks to write both in group0 (T385164) (duration: 13m 20s)
  • 22:12 ladsgroup@deploy2002: ladsgroup: Continuing with sync
  • 22:10 ladsgroup@deploy2002: ladsgroup: Backport for Set categorylinks to write both in group0 (T385164) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 22:05 ladsgroup@deploy2002: Started scap sync-world: Backport for Set categorylinks to write both in group0 (T385164)
  • 22:01 ladsgroup@deploy2002: Finished scap sync-world: Backport for Set file migration to write both everywhere except commons and enwiki (T384481) (duration: 11m 01s)
  • 21:55 ladsgroup@deploy2002: ladsgroup: Continuing with sync
  • 21:53 ladsgroup@deploy2002: ladsgroup: Backport for Set file migration to write both everywhere except commons and enwiki (T384481) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:50 ladsgroup@deploy2002: Started scap sync-world: Backport for Set file migration to write both everywhere except commons and enwiki (T384481)
  • 21:48 jforrester@deploy2002: Finished scap sync-world: Backport for Drop old wikifunctions.ui event stream, replaced by ….wikifunctions_ui (T369949) (duration: 17m 43s)
  • 21:42 jforrester@deploy2002: jforrester: Continuing with sync
  • 21:36 jforrester@deploy2002: jforrester: Backport for Drop old wikifunctions.ui event stream, replaced by ….wikifunctions_ui (T369949) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:30 jforrester@deploy2002: Started scap sync-world: Backport for Drop old wikifunctions.ui event stream, replaced by ….wikifunctions_ui (T369949)
  • 21:29 jforrester@deploy2002: Finished scap sync-world: Backport for Parsoid fragment support: fix handling of 'nowiki' and 'general' strip markers (duration: 16m 39s)
  • 21:22 jforrester@deploy2002: cscott, jforrester: Continuing with sync
  • 21:17 jforrester@deploy2002: cscott, jforrester: Backport for Parsoid fragment support: fix handling of 'nowiki' and 'general' strip markers synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:12 jforrester@deploy2002: Started scap sync-world: Backport for Parsoid fragment support: fix handling of 'nowiki' and 'general' strip markers
  • 21:09 jforrester@deploy2002: Finished scap sync-world: Backport for [wikifunctionswiki] Set flags for repo mode (on) and client (off) (duration: 09m 56s)
  • 21:03 jforrester@deploy2002: jforrester: Continuing with sync
  • 21:02 jforrester@deploy2002: jforrester: Backport for [wikifunctionswiki] Set flags for repo mode (on) and client (off) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:59 jforrester@deploy2002: Started scap sync-world: Backport for [wikifunctionswiki] Set flags for repo mode (on) and client (off)
  • 20:37 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1244 (T384592)', diff saved to https://phabricator.wikimedia.org/P73191 and previous config saved to /var/cache/conftool/dbconfig/20250204-203754-marostegui.json
  • 20:37 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1244.eqiad.wmnet with reason: Maintenance
  • 20:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1243 (T384592)', diff saved to https://phabricator.wikimedia.org/P73190 and previous config saved to /var/cache/conftool/dbconfig/20250204-203732-marostegui.json
  • 20:36 swfrench-wmf: finished running puppet on A:cp-text after merging https://gerrit.wikimedia.org/r/1084247
  • 20:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P73189 and previous config saved to /var/cache/conftool/dbconfig/20250204-202225-marostegui.json
  • 20:09 swfrench-wmf: running puppet on A:cp-text after merging https://gerrit.wikimedia.org/r/1084247
  • 20:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P73188 and previous config saved to /var/cache/conftool/dbconfig/20250204-200718-marostegui.json
  • 20:07 swfrench-wmf: verified behavior of https://gerrit.wikimedia.org/r/1084247 on cp4040
  • 19:59 swfrench-wmf: disabled puppet on A:cp-text before merging https://gerrit.wikimedia.org/r/1084247
  • 19:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1243 (T384592)', diff saved to https://phabricator.wikimedia.org/P73187 and previous config saved to /var/cache/conftool/dbconfig/20250204-195211-marostegui.json
  • 18:42 swfrench-wmf: mw-api-int to ~ 2% of traffic on PHP 8.1 - T383845
  • 18:40 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
  • 18:39 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
  • 18:39 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
  • 18:39 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
  • 18:37 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
  • 18:36 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
  • 18:36 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
  • 18:35 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
  • 18:31 cwhite@deploy2002: helmfile [staging] DONE helmfile.d/services/thumbor: apply
  • 18:31 cwhite@deploy2002: helmfile [staging] START helmfile.d/services/thumbor: apply
  • 18:30 cwhite@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: apply
  • 18:30 cwhite@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: apply
  • 18:20 swfrench@deploy2002: Finished scap sync-world: Backport for Enroll 25% of client sessions in PHP 8.1 (T383845) (duration: 11m 25s)
  • 18:13 swfrench@deploy2002: swfrench: Continuing with sync
  • 18:12 swfrench@deploy2002: swfrench: Backport for Enroll 25% of client sessions in PHP 8.1 (T383845) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 18:09 swfrench@deploy2002: Started scap sync-world: Backport for Enroll 25% of client sessions in PHP 8.1 (T383845)
  • 18:05 swfrench-wmf: scaled mw-api-ext next to 15% of main release - T383845
  • 18:04 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
  • 18:04 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
  • 18:04 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
  • 18:03 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
  • 18:02 swfrench-wmf: scaled mw-web next to 15% of main release - T383845
  • 18:01 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
  • 18:01 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
  • 18:00 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
  • 18:00 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
  • 17:53 elukey@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
  • 17:51 vgutierrez: repooling lvs4008 - T384477
  • 17:51 elukey@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
  • 17:47 mutante: codesearch.wmflabs.org - hard reboot instance for needed mass reboots in cloud VPS
  • 17:47 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti1053.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 17:42 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ganeti1053.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 17:40 vgutierrez@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs4008.ulsfo.wmnet with OS bookworm
  • 17:18 vgutierrez@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs4008.ulsfo.wmnet with reason: host reimage
  • 17:17 elukey@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1053.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 17:16 vgutierrez@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs4008.ulsfo.wmnet with reason: host reimage
  • 17:14 marostegui@cumin1002: dbctl commit (dc=all): 'db2222 (re)pooling @ 100%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73186 and previous config saved to /var/cache/conftool/dbconfig/20250204-171415-root.json
  • 17:12 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ganeti1053.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 17:11 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti1053.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 17:01 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ganeti1053.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:59 marostegui@cumin1002: dbctl commit (dc=all): 'db2222 (re)pooling @ 75%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73185 and previous config saved to /var/cache/conftool/dbconfig/20250204-165909-root.json
  • 16:58 vgutierrez@cumin1002: START - Cookbook sre.hosts.reimage for host lvs4008.ulsfo.wmnet with OS bookworm
  • 16:56 vgutierrez@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host lvs4008.ulsfo.wmnet with OS bookworm
  • 16:51 elukey@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1053.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:50 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ganeti1053.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:49 elukey@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1053.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:48 elukey@cumin1002: START - Cookbook sre.hosts.provision for host ganeti1053.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:48 marostegui@cumin1002: dbctl commit (dc=all): 'db1227 (re)pooling @ 100%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73184 and previous config saved to /var/cache/conftool/dbconfig/20250204-164802-root.json
  • 16:44 marostegui@cumin1002: dbctl commit (dc=all): 'db2222 (re)pooling @ 50%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73183 and previous config saved to /var/cache/conftool/dbconfig/20250204-164405-root.json
  • 16:32 marostegui@cumin1002: dbctl commit (dc=all): 'db1227 (re)pooling @ 75%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73182 and previous config saved to /var/cache/conftool/dbconfig/20250204-163256-root.json
  • 16:29 marostegui@cumin1002: dbctl commit (dc=all): 'db2222 (re)pooling @ 25%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73181 and previous config saved to /var/cache/conftool/dbconfig/20250204-162900-root.json
  • 16:17 marostegui@cumin1002: dbctl commit (dc=all): 'db1227 (re)pooling @ 50%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73180 and previous config saved to /var/cache/conftool/dbconfig/20250204-161751-root.json
  • 16:17 topranks: disable et-0/0/0 on cr3-ulsfo to prep for optic replacement T384288
  • 16:17 topranks: disable et-0/0/0 on cr3-ulsfo to prep for optic replacement
  • 16:16 cmooney@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: replace faulty optic et-0/0/0
  • 16:13 marostegui@cumin1002: dbctl commit (dc=all): 'db2222 (re)pooling @ 10%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73179 and previous config saved to /var/cache/conftool/dbconfig/20250204-161355-root.json
  • 16:12 reedy@deploy2002: Finished scap sync-world: Backport for Poem: Null coalescence $in (T385588), Poem: Null coalescence $in (T385588), Hooks: Check for null option in onSpecialMuteModifyFormFields (T385169), Hooks: Check for null option in onSpecialMuteModifyFormFields (T385169) (duration: 09m 50s)
  • 16:06 reedy@deploy2002: reedy: Continuing with sync
  • 16:06 reedy@deploy2002: reedy: Backport for Poem: Null coalescence $in (T385588), Poem: Null coalescence $in (T385588), Hooks: Check for null option in onSpecialMuteModifyFormFields (T385169), Hooks: Check for null option in onSpecialMuteModifyFormFields (T385169) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 16:03 reedy@deploy2002: Started scap sync-world: Backport for Poem: Null coalescence $in (T385588), Poem: Null coalescence $in (T385588), Hooks: Check for null option in onSpecialMuteModifyFormFields (T385169), Hooks: Check for null option in onSpecialMuteModifyFormFields (T385169)
  • 16:02 marostegui@cumin1002: dbctl commit (dc=all): 'db1227 (re)pooling @ 25%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73178 and previous config saved to /var/cache/conftool/dbconfig/20250204-160246-root.json
  • 16:00 herron@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) _etcd-server-ssl._tcp.aux-k8s-etcd.codfw.wmnet on all recursors
  • 16:00 herron@cumin1002: START - Cookbook sre.dns.wipe-cache _etcd-server-ssl._tcp.aux-k8s-etcd.codfw.wmnet on all recursors
  • 15:56 vgutierrez@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs4008.ulsfo.wmnet with reason: host reimage
  • 15:53 vgutierrez@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs4008.ulsfo.wmnet with reason: host reimage
  • 15:51 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1053.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 15:50 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host ganeti1053.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 15:47 marostegui@cumin1002: dbctl commit (dc=all): 'db1227 (re)pooling @ 10%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73177 and previous config saved to /var/cache/conftool/dbconfig/20250204-154740-root.json
  • 15:44 herron@dns1004: END - running authdns-update
  • 15:44 mszabo@deploy2002: Finished scap sync-world: Backport for Remove flag wgSecurePollSingleTransferableVoteEnabled (T376930) (duration: 11m 21s)
  • 15:42 herron@dns1004: START - running authdns-update
  • 15:37 mszabo@deploy2002: mimurawil, mszabo: Continuing with sync
  • 15:36 mszabo@deploy2002: mimurawil, mszabo: Backport for Remove flag wgSecurePollSingleTransferableVoteEnabled (T376930) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 15:35 vgutierrez@cumin1002: START - Cookbook sre.hosts.reimage for host lvs4008.ulsfo.wmnet with OS bookworm
  • 15:33 mszabo@deploy2002: Started scap sync-world: Backport for Remove flag wgSecurePollSingleTransferableVoteEnabled (T376930)
  • 15:32 vgutierrez: reimaging lvs4008 as a liberica LB - T384477
  • 15:29 mszabo@deploy2002: Finished scap sync-world: Backport for Remove flag $wgSecurePollSingleTransferableVoteEnabled (T376930), Remove flag $wgSecurePollSingleTransferableVoteEnabled (T376930) (duration: 13m 46s)
  • 15:23 mszabo@deploy2002: mszabo: Continuing with sync
  • 15:20 mszabo@deploy2002: mszabo: Backport for Remove flag $wgSecurePollSingleTransferableVoteEnabled (T376930), Remove flag $wgSecurePollSingleTransferableVoteEnabled (T376930) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 15:16 mszabo@deploy2002: Started scap sync-world: Backport for Remove flag $wgSecurePollSingleTransferableVoteEnabled (T376930), Remove flag $wgSecurePollSingleTransferableVoteEnabled (T376930)
  • 15:14 herron@dns1004: END - running authdns-update
  • 15:12 herron@dns1004: START - running authdns-update
  • 15:06 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Avoid PHP Notice on missing entityschema-meta-tags (T385272), Avoid PHP Notice on missing entityschema-meta-tags (T385272) (duration: 10m 56s)
  • 14:59 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde: Continuing with sync
  • 14:59 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde: Backport for Avoid PHP Notice on missing entityschema-meta-tags (T385272), Avoid PHP Notice on missing entityschema-meta-tags (T385272) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:55 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Avoid PHP Notice on missing entityschema-meta-tags (T385272), Avoid PHP Notice on missing entityschema-meta-tags (T385272)
  • 14:54 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for EventStreamConfig: Add mediawiki.article_country_prediction_change stream (T382295) (duration: 16m 23s)
  • 14:46 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, kevinbazira: Continuing with sync
  • 14:43 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, kevinbazira: Backport for EventStreamConfig: Add mediawiki.article_country_prediction_change stream (T382295) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:38 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for EventStreamConfig: Add mediawiki.article_country_prediction_change stream (T382295)
  • 14:36 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for kowikisource: Add Draft namespace (T385162) (duration: 29m 05s)
  • 14:26 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, revi: Continuing with sync
  • 14:25 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, revi: Backport for kowikisource: Add Draft namespace (T385162) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:10 elukey@deploy2002: helmfile [codfw] DONE helmfile.d/services/kartotherian: sync
  • 14:10 elukey@deploy2002: helmfile [codfw] START helmfile.d/services/kartotherian: sync
  • 14:09 Lucas_WMDE: lucaswerkmeister-wmde@deploy2002 Started scap sync-world: Backport for kowikisource: Add Draft namespace (T385162) # re-log from 14:07 UTC
  • 13:46 marostegui@cumin1002: dbctl commit (dc=all): 'db1229 (re)pooling @ 100%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73176 and previous config saved to /var/cache/conftool/dbconfig/20250204-134646-root.json
  • 13:44 aborrero@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudgw1004.eqiad.wmnet with OS bullseye
  • 13:35 andrew@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudgw1002.eqiad.wmnet with OS bookworm
  • 13:31 marostegui@cumin1002: dbctl commit (dc=all): 'db1229 (re)pooling @ 75%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73175 and previous config saved to /var/cache/conftool/dbconfig/20250204-133141-root.json
  • 13:27 aborrero@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudgw1004.eqiad.wmnet with reason: host reimage
  • 13:23 aborrero@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudgw1004.eqiad.wmnet with reason: host reimage
  • 13:17 andrew@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudgw1002.eqiad.wmnet with reason: host reimage
  • 13:16 marostegui@cumin1002: dbctl commit (dc=all): 'db1229 (re)pooling @ 50%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73174 and previous config saved to /var/cache/conftool/dbconfig/20250204-131636-root.json
  • 13:14 andrew@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudgw1002.eqiad.wmnet with reason: host reimage
  • 13:11 marostegui@cumin1002: dbctl commit (dc=all): 'db2220 (re)pooling @ 100%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73173 and previous config saved to /var/cache/conftool/dbconfig/20250204-131118-root.json
  • 13:09 godog: upgrade poolcounter-prometheus-exporter to 0.1.2 - T333947
  • 13:07 aborrero@cumin1002: START - Cookbook sre.hosts.reimage for host cloudgw1004.eqiad.wmnet with OS bullseye
  • 13:04 aborrero@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudgw1004.eqiad.wmnet with OS bookworm
  • 12:46 marostegui@cumin1002: dbctl commit (dc=all): 'db1229 (re)pooling @ 10%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73168 and previous config saved to /var/cache/conftool/dbconfig/20250204-124625-root.json
  • 12:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P73167 and previous config saved to /var/cache/conftool/dbconfig/20250204-124345-marostegui.json
  • 12:41 marostegui@cumin1002: dbctl commit (dc=all): 'db2220 (re)pooling @ 50%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73166 and previous config saved to /var/cache/conftool/dbconfig/20250204-124107-root.json
  • 12:40 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 12:39 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 12:38 jynus: deploying new backup grants for ES hosts T383902
  • 12:33 jiji@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
  • 12:32 jiji@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox: apply
  • 12:28 vgutierrez@cumin1002: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on P{lvs2011.codfw.wmnet,lvs6001.drmrs.wmnet,lvs1017.eqiad.wmnet,lvs3008.esams.wmnet,lvs7001.magru.wmnet} and A:lvs (T373027)
  • 12:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P73165 and previous config saved to /var/cache/conftool/dbconfig/20250204-122838-marostegui.json
  • 12:27 vgutierrez@cumin1002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on P{lvs2011.codfw.wmnet,lvs6001.drmrs.wmnet,lvs1017.eqiad.wmnet,lvs3008.esams.wmnet,lvs7001.magru.wmnet} and A:lvs (T373027)
  • 12:26 vgutierrez: upgrading pybal on high-traffic1 load balancers - T373027
  • 12:26 marostegui@cumin1002: dbctl commit (dc=all): 'db2220 (re)pooling @ 25%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73164 and previous config saved to /var/cache/conftool/dbconfig/20250204-122602-root.json
  • 12:25 vgutierrez@cumin1002: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on P{lvs2012.codfw.wmnet,lvs6002.drmrs.wmnet,lvs1018.eqiad.wmnet,lvs3009.esams.wmnet,lvs7002.magru.wmnet} and A:lvs (T373027)
  • 12:24 vgutierrez@cumin1002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on P{lvs2012.codfw.wmnet,lvs6002.drmrs.wmnet,lvs1018.eqiad.wmnet,lvs3009.esams.wmnet,lvs7002.magru.wmnet} and A:lvs (T373027)
  • 12:23 vgutierrez: upgrading pybal on high-traffic2 load balancers - T373027
  • 12:23 root@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2222.codfw.wmnet with reason: Index rebuild
  • 12:21 vgutierrez@cumin1002: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs-low-traffic (T373027)
  • 12:20 root@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2222.codfw.wmnet
  • 12:20 vgutierrez@cumin1002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-low-traffic (T373027)
  • 12:18 vgutierrez: upgrading pybal on low-traffic load balancers - T373027
  • 12:17 vgutierrez@cumin1002: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on P{lvs2014.codfw.wmnet,lvs6003.drmrs.wmnet,lvs1020.eqiad.wmnet,lvs3010.esams.wmnet,lvs7003.magru.wmnet} and A:lvs (T373027)
  • 12:15 vgutierrez@cumin1002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on P{lvs2014.codfw.wmnet,lvs6003.drmrs.wmnet,lvs1020.eqiad.wmnet,lvs3010.esams.wmnet,lvs7003.magru.wmnet} and A:lvs (T373027)
  • 12:15 root@cumin1002: START - Cookbook sre.mysql.upgrade for db2222.codfw.wmnet
  • 12:15 vgutierrez: upgrading pybal on secondary load balancers - T373027
  • 12:14 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2222 for index rebuild', diff saved to https://phabricator.wikimedia.org/P73163 and previous config saved to /var/cache/conftool/dbconfig/20250204-121450-marostegui.json
  • 12:14 marostegui@cumin1002: dbctl commit (dc=all): 'es2040 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P73162 and previous config saved to /var/cache/conftool/dbconfig/20250204-121400-root.json
  • 12:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1242 (T384592)', diff saved to https://phabricator.wikimedia.org/P73161 and previous config saved to /var/cache/conftool/dbconfig/20250204-121331-marostegui.json
  • 12:11 vgutierrez@cumin1002: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on P{lvs500[4-5]*} and A:lvs (T373027)
  • 12:10 marostegui@cumin1002: dbctl commit (dc=all): 'db2220 (re)pooling @ 10%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73160 and previous config saved to /var/cache/conftool/dbconfig/20250204-121056-root.json
  • 12:10 vgutierrez@cumin1002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on P{lvs500[4-5]*} and A:lvs (T373027)
  • 12:07 elukey: manually executed docker-system-prune-dangling.service on build2001
  • 12:04 elukey: manually dropped 2.5.1rocm6.2-1-20250202 on build2001 - T385531
  • 12:03 vgutierrez: upgrading pybal on eqsin - T373027
  • 11:59 elukey@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
  • 11:59 elukey@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
  • 11:58 marostegui@cumin1002: dbctl commit (dc=all): 'es2040 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P73158 and previous config saved to /var/cache/conftool/dbconfig/20250204-115855-root.json
  • 11:54 vgutierrez: uploaded pybal 1.15.15 to apt.wm.o (bullseye-wikimedia) T373027
  • 11:54 root@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1227.eqiad.wmnet with reason: Index rebuild
  • 11:54 root@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db1227.eqiad.wmnet
  • 11:53 marostegui@cumin1002: dbctl commit (dc=all): 'db1236 (re)pooling @ 100%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73157 and previous config saved to /var/cache/conftool/dbconfig/20250204-115323-root.json
  • 11:48 root@cumin1002: START - Cookbook sre.mysql.upgrade for db1227.eqiad.wmnet
  • 11:48 jynus: deploying new backup grants for matomo and analytics_meta T383902
  • 11:48 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1227 for index rebuild', diff saved to https://phabricator.wikimedia.org/P73156 and previous config saved to /var/cache/conftool/dbconfig/20250204-114808-marostegui.json
  • 11:43 marostegui@cumin1002: dbctl commit (dc=all): 'es2040 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P73155 and previous config saved to /var/cache/conftool/dbconfig/20250204-114350-root.json
  • 11:41 jiji@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 11:39 jiji@deploy2002: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 11:38 marostegui@cumin1002: dbctl commit (dc=all): 'db1236 (re)pooling @ 75%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73154 and previous config saved to /var/cache/conftool/dbconfig/20250204-113818-root.json
  • 11:34 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 11:33 jiji@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 11:33 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-jobrunner: apply
  • 11:31 jiji@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-jobrunner: apply
  • 11:28 marostegui@cumin1002: dbctl commit (dc=all): 'es2040 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P73153 and previous config saved to /var/cache/conftool/dbconfig/20250204-112844-root.json
  • 11:28 jiji@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
  • 11:26 jiji@deploy2002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
  • 11:23 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 11:23 marostegui@cumin1002: dbctl commit (dc=all): 'db1236 (re)pooling @ 50%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73152 and previous config saved to /var/cache/conftool/dbconfig/20250204-112313-root.json
  • 11:22 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
  • 11:22 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 11:20 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
  • 11:20 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
  • 11:18 jiji@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox: apply
  • 11:17 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 11:17 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply
  • 11:13 marostegui@cumin1002: dbctl commit (dc=all): 'es2040 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P73151 and previous config saved to /var/cache/conftool/dbconfig/20250204-111337-root.json
  • 11:08 marostegui@cumin1002: dbctl commit (dc=all): 'db1197 (re)pooling @ 100%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73150 and previous config saved to /var/cache/conftool/dbconfig/20250204-110830-root.json
  • 11:08 marostegui@cumin1002: dbctl commit (dc=all): 'db1236 (re)pooling @ 25%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73149 and previous config saved to /var/cache/conftool/dbconfig/20250204-110808-root.json
  • 11:03 root@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1229.eqiad.wmnet with reason: Index rebuild
  • 11:01 root@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db1229.eqiad.wmnet
  • 10:59 root@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for es2040.codfw.wmnet
  • 10:56 root@cumin1002: START - Cookbook sre.mysql.upgrade for db1229.eqiad.wmnet
  • 10:55 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1229 for index rebuild', diff saved to https://phabricator.wikimedia.org/P73148 and previous config saved to /var/cache/conftool/dbconfig/20250204-105546-marostegui.json
  • 10:54 root@cumin1002: START - Cookbook sre.mysql.upgrade for es2040.codfw.wmnet
  • 10:54 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es2040 for kernel reboot', diff saved to https://phabricator.wikimedia.org/P73147 and previous config saved to /var/cache/conftool/dbconfig/20250204-105411-marostegui.json
  • 10:53 marostegui@cumin1002: dbctl commit (dc=all): 'db1197 (re)pooling @ 75%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73146 and previous config saved to /var/cache/conftool/dbconfig/20250204-105323-root.json
  • 10:53 marostegui@cumin1002: dbctl commit (dc=all): 'db1236 (re)pooling @ 10%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73145 and previous config saved to /var/cache/conftool/dbconfig/20250204-105302-root.json
  • 10:44 Amir1: foreachwiki sql.php /srv/mediawiki/php-1.44.0-wmf.14/sql/mysql/patch-collation.sql (T384592)
  • 10:38 marostegui@cumin1002: dbctl commit (dc=all): 'db1197 (re)pooling @ 50%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73144 and previous config saved to /var/cache/conftool/dbconfig/20250204-103818-root.json
  • 10:32 elukey@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
  • 10:32 elukey@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
  • 10:24 elukey@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
  • 10:23 marostegui@cumin1002: dbctl commit (dc=all): 'db1197 (re)pooling @ 25%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73143 and previous config saved to /var/cache/conftool/dbconfig/20250204-102313-root.json
  • 10:22 elukey@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
  • 10:15 jgiannelos@deploy2002: helmfile [codfw] START helmfile.d/services/kartotherian: apply
  • 10:15 jgiannelos@deploy2002: helmfile [eqiad] DONE helmfile.d/services/kartotherian: apply
  • 10:13 jgiannelos@deploy2002: helmfile [eqiad] START helmfile.d/services/kartotherian: apply
  • 10:13 elukey: depool maps1006 from all services to run perf tests
  • 10:13 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: apply
  • 10:13 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: apply
  • 10:08 marostegui@cumin1002: dbctl commit (dc=all): 'db1197 (re)pooling @ 10%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73141 and previous config saved to /var/cache/conftool/dbconfig/20250204-100807-root.json
  • 09:43 marostegui@cumin1002: dbctl commit (dc=all): 'es2039 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P73140 and previous config saved to /var/cache/conftool/dbconfig/20250204-094344-root.json
  • 09:41 slyngshede@dns1004: END - running authdns-update
  • 09:39 slyngshede@dns1004: START - running authdns-update
  • 09:39 slyngshede@dns1004: START - running authdns-update
  • 09:39 slyngshede@dns1004: START - running authdns-update
  • 09:38 urbanecm: mwmaint2002: Kill `mediawiki_job_growthexperiments-refreshLinkRecommendations-s6[6640]` to pick new config (T378527)
  • 09:34 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: apply
  • 09:33 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: apply
  • 09:28 marostegui@cumin1002: dbctl commit (dc=all): 'es2039 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P73139 and previous config saved to /var/cache/conftool/dbconfig/20250204-092838-root.json
  • 09:28 marostegui@cumin1002: dbctl commit (dc=all): 'db2227 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P73138 and previous config saved to /var/cache/conftool/dbconfig/20250204-092759-root.json
  • 09:22 marostegui@cumin1002: dbctl commit (dc=all): 'db2209 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P73137 and previous config saved to /var/cache/conftool/dbconfig/20250204-092232-root.json
  • 09:18 jnuche@deploy2002: rebuilt and synchronized wikiversions files: group0 to 1.44.0-wmf.15 refs T382366
  • 09:13 marostegui@cumin1002: dbctl commit (dc=all): 'es2039 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P73136 and previous config saved to /var/cache/conftool/dbconfig/20250204-091334-root.json
  • 09:13 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 09:12 marostegui@cumin1002: dbctl commit (dc=all): 'db2227 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P73135 and previous config saved to /var/cache/conftool/dbconfig/20250204-091254-root.json
  • 09:12 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 09:07 marostegui@cumin1002: dbctl commit (dc=all): 'db2209 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P73134 and previous config saved to /var/cache/conftool/dbconfig/20250204-090726-root.json
  • 09:04 urbanecm@deploy2002: Finished scap sync-world: Backport for Move link recommendation minimum tasks per topic to PHP configuration (T383714) (duration: 17m 28s)
  • 09:03 fceratto@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1169.eqiad.wmnet
  • 09:03 fceratto@cumin1002: START - Cookbook sre.hosts.remove-downtime for db1169.eqiad.wmnet
  • 08:58 marostegui@cumin1002: dbctl commit (dc=all): 'es2039 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P73132 and previous config saved to /var/cache/conftool/dbconfig/20250204-085828-root.json
  • 08:57 marostegui@cumin1002: dbctl commit (dc=all): 'db2227 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P73131 and previous config saved to /var/cache/conftool/dbconfig/20250204-085749-root.json
  • 08:57 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1169.eqiad.wmnet with reason: Repooling after clone - T383760
  • 08:56 urbanecm@deploy2002: urbanecm: Continuing with sync
  • 08:53 urbanecm@deploy2002: urbanecm: Backport for Move link recommendation minimum tasks per topic to PHP configuration (T383714) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:52 XioNoX: push pfw policies T384885
  • 08:52 marostegui@cumin1002: dbctl commit (dc=all): 'db2209 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P73130 and previous config saved to /var/cache/conftool/dbconfig/20250204-085221-root.json
  • 08:46 urbanecm@deploy2002: Started scap sync-world: Backport for Move link recommendation minimum tasks per topic to PHP configuration (T383714)
  • 08:42 marostegui@cumin1002: dbctl commit (dc=all): 'db2227 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P73129 and previous config saved to /var/cache/conftool/dbconfig/20250204-084244-root.json
  • 08:40 marostegui@cumin1002: dbctl commit (dc=all): 'es2039 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P73128 and previous config saved to /var/cache/conftool/dbconfig/20250204-084052-root.json
  • 08:37 marostegui@cumin1002: dbctl commit (dc=all): 'db2209 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P73127 and previous config saved to /var/cache/conftool/dbconfig/20250204-083716-root.json
  • 08:36 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2039.codfw.wmnet with reason: Rebuild tables
  • 08:34 root@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for es2039.codfw.wmnet
  • 08:30 urbanecm@deploy2002: Finished scap sync-world: Backport for Add configurable MinimumTasksPerTopic (T383714), [Growth] Increase minimum tasks per topic to 2000 for eswiki, frwiki (T378527) (duration: 25m 56s)
  • 08:29 root@cumin1002: START - Cookbook sre.mysql.upgrade for es2039.codfw.wmnet
  • 08:29 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es2039 for kernel reboot', diff saved to https://phabricator.wikimedia.org/P73125 and previous config saved to /var/cache/conftool/dbconfig/20250204-082912-marostegui.json
  • 08:27 marostegui@cumin1002: dbctl commit (dc=all): 'db2227 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P73124 and previous config saved to /var/cache/conftool/dbconfig/20250204-082738-root.json
  • 08:22 marostegui@cumin1002: dbctl commit (dc=all): 'db2209 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P73123 and previous config saved to /var/cache/conftool/dbconfig/20250204-082210-root.json
  • 08:19 urbanecm@deploy2002: urbanecm, cyndywikime: Continuing with sync
  • 08:18 root@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1236.eqiad.wmnet with reason: Index rebuild
  • 08:18 root@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2220.codfw.wmnet with reason: Index rebuild
  • 08:18 root@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2220.codfw.wmnet
  • 08:17 root@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db1236.eqiad.wmnet
  • 08:15 marostegui@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db2227.codfw.wmnet onto db2209.codfw.wmnet
  • 08:13 urbanecm@deploy2002: urbanecm, cyndywikime: Backport for Add configurable MinimumTasksPerTopic (T383714), [Growth] Increase minimum tasks per topic to 2000 for eswiki, frwiki (T378527) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:12 root@cumin1002: START - Cookbook sre.mysql.upgrade for db1236.eqiad.wmnet
  • 08:12 root@cumin1002: START - Cookbook sre.mysql.upgrade for db2220.codfw.wmnet
  • 08:12 root@cumin1002: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 12:00:00 on db2220.codfw.wmnet with reason: Index rebuild
  • 08:11 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2220', diff saved to https://phabricator.wikimedia.org/P73122 and previous config saved to /var/cache/conftool/dbconfig/20250204-081151-marostegui.json
  • 08:11 root@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1236.eqiad.wmnet with reason: Index rebuild
  • 08:10 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1236', diff saved to https://phabricator.wikimedia.org/P73121 and previous config saved to /var/cache/conftool/dbconfig/20250204-081056-marostegui.json
  • 08:04 urbanecm@deploy2002: Started scap sync-world: Backport for Add configurable MinimumTasksPerTopic (T383714), [Growth] Increase minimum tasks per topic to 2000 for eswiki, frwiki (T378527)
  • 08:02 root@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Index rebuild
  • 08:01 root@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db1197.eqiad.wmnet
  • 07:54 root@cumin1002: START - Cookbook sre.mysql.upgrade for db1197.eqiad.wmnet
  • 07:54 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1197', diff saved to https://phabricator.wikimedia.org/P73120 and previous config saved to /var/cache/conftool/dbconfig/20250204-075440-marostegui.json
  • 07:49 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet with reason: Rebuild tables
  • 07:48 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb1013.eqiad.wmnet with reason: Rebuild tables
  • 06:45 marostegui@cumin1002: START - Cookbook sre.mysql.clone of db2227.codfw.wmnet onto db2209.codfw.wmnet
  • 06:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2227 to clone db2209', diff saved to https://phabricator.wikimedia.org/P73119 and previous config saved to /var/cache/conftool/dbconfig/20250204-064425-marostegui.json
  • 06:26 kartik@deploy2002: helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
  • 05:45 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1242 (T384592)', diff saved to https://phabricator.wikimedia.org/P73118 and previous config saved to /var/cache/conftool/dbconfig/20250204-054505-marostegui.json
  • 05:44 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1242.eqiad.wmnet with reason: Maintenance
  • 05:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1241 (T384592)', diff saved to https://phabricator.wikimedia.org/P73117 and previous config saved to /var/cache/conftool/dbconfig/20250204-054443-marostegui.json
  • 05:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P73116 and previous config saved to /var/cache/conftool/dbconfig/20250204-052936-marostegui.json
  • 05:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P73115 and previous config saved to /var/cache/conftool/dbconfig/20250204-051429-marostegui.json
  • 05:04 mwpresync@deploy2002: Pruned MediaWiki: 1.44.0-wmf.12 (duration: 04m 49s)
  • 05:01 mwpresync@deploy2002: Finished scap sync-world: testwikis to 1.44.0-wmf.15 refs T382366 (duration: 58m 53s)
  • 04:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1241 (T384592)', diff saved to https://phabricator.wikimedia.org/P73114 and previous config saved to /var/cache/conftool/dbconfig/20250204-045922-marostegui.json
  • 04:02 mwpresync@deploy2002: Started scap sync-world: testwikis to 1.44.0-wmf.15 refs T382366
  • 03:19 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1054
  • 03:16 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1054
  • 03:04 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1053
  • 02:55 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1053
  • 02:52 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 02:49 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 02:43 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1053
  • 02:42 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1053
  • 02:42 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 02:40 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 02:06 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1054
  • 02:04 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1054
  • 02:04 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1053
  • 02:02 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1053
  • 02:01 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 01:58 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 01:12 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 01:11 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add dns entries for new frack nodes - pt1979@cumin2002"
  • 01:11 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add dns entries for new frack nodes - pt1979@cumin2002"
  • 01:08 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 00:00 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1241 (T384592)', diff saved to https://phabricator.wikimedia.org/P73112 and previous config saved to /var/cache/conftool/dbconfig/20250204-000010-marostegui.json
  • 00:00 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1241.eqiad.wmnet with reason: Maintenance

2025-02-03

  • 23:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1238 (T384592)', diff saved to https://phabricator.wikimedia.org/P73111 and previous config saved to /var/cache/conftool/dbconfig/20250203-235947-marostegui.json
  • 23:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P73110 and previous config saved to /var/cache/conftool/dbconfig/20250203-234440-marostegui.json
  • 23:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P73109 and previous config saved to /var/cache/conftool/dbconfig/20250203-232933-marostegui.json
  • 23:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1238 (T384592)', diff saved to https://phabricator.wikimedia.org/P73108 and previous config saved to /var/cache/conftool/dbconfig/20250203-231428-marostegui.json
  • 23:10 dwisehaupt@dns1004: END - running authdns-update
  • 23:09 vriley@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 23:09 vriley@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt franio1002 - vriley@cumin1002"
  • 23:09 vriley@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt franio1002 - vriley@cumin1002"
  • 23:08 dwisehaupt@dns1004: START - running authdns-update
  • 23:01 vriley@cumin1002: START - Cookbook sre.dns.netbox
  • 22:44 vriley@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 22:44 vriley@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt franio1001 - vriley@cumin1002"
  • 22:43 vriley@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt franio1001 - vriley@cumin1002"
  • 22:39 vriley@cumin1002: START - Cookbook sre.dns.netbox
  • 21:15 urbanecm@deploy2002: Finished scap sync-world: Backport for [Growth] enwiki: Enable mentorship for 75% of new accounts (T384505) (duration: 10m 22s)
  • 21:08 urbanecm@deploy2002: urbanecm: Continuing with sync
  • 21:08 urbanecm@deploy2002: urbanecm: Backport for [Growth] enwiki: Enable mentorship for 75% of new accounts (T384505) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:04 urbanecm@deploy2002: Started scap sync-world: Backport for [Growth] enwiki: Enable mentorship for 75% of new accounts (T384505)
  • 20:33 rzl@deploy2002: Finished scap sync-world: T383952, T384137 (duration: 06m 10s)
  • 20:32 rzl@deploy2002: rzl: Continuing with sync
  • 20:31 rzl@deploy2002: rzl: T383952, T384137 synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:29 rzl@deploy2002: Started scap sync-world: T383952, T384137
  • 19:47 rzl@deploy2002: Started scap sync-world: T383952, T384137
  • 19:40 swfrench-wmf: ran reprepro include mercurius 1.1.0-1 - T385225
  • 19:36 ejegg: fundraising civicrm upgraded from 3e566467 to abe0fc61
  • 19:00 vriley@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:00 vriley@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt fransw1001 - vriley@cumin1002"
  • 19:00 vriley@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt fransw1001 - vriley@cumin1002"
  • 18:55 vriley@cumin1002: START - Cookbook sre.dns.netbox
  • 18:51 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
  • 18:50 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
  • 18:50 fceratto@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db1169.eqiad.wmnet onto db1251.eqiad.wmnet
  • 18:46 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
  • 18:45 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
  • 18:41 swfrench-wmf: mw-api-int to ~ 1% of traffic on PHP 8.1 in codfw - T383845
  • 18:39 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
  • 18:38 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
  • 18:28 swfrench-wmf: mw-api-int to ~ 1% of traffic on PHP 8.1 in eqiad - T383845
  • 18:27 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
  • 18:25 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
  • 18:13 swfrench@deploy2002: Finished scap sync-world: Backport for Enroll 10% of client sessions in PHP 8.1 (T383845) (duration: 11m 13s)
  • 18:07 swfrench@deploy2002: swfrench: Continuing with sync
  • 18:06 swfrench@deploy2002: swfrench: Backport for Enroll 10% of client sessions in PHP 8.1 (T383845) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 18:02 swfrench@deploy2002: Started scap sync-world: Backport for Enroll 10% of client sessions in PHP 8.1 (T383845)
  • 18:01 urbanecm: [urbanecm@deploy2002 ~]$ mwscript-k8s -f -- extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=newiki --logwiki=metawiki 'Tarasssst' 'TR101' # T385503
  • 17:59 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1238 (T384592)', diff saved to https://phabricator.wikimedia.org/P73107 and previous config saved to /var/cache/conftool/dbconfig/20250203-175904-marostegui.json
  • 17:58 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1238.eqiad.wmnet with reason: Maintenance
  • 17:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1221 (T384592)', diff saved to https://phabricator.wikimedia.org/P73106 and previous config saved to /var/cache/conftool/dbconfig/20250203-175843-marostegui.json
  • 17:58 urbanecm: [urbanecm@deploy2002 ~]$ mwscript-k8s -f -- extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=newiki --logwiki=metawiki 'JOestby' 'Johannesoestby' # T385503
  • 17:47 cdanis@deploy2002: helmfile [codfw] DONE helmfile.d/services/chart-renderer: apply
  • 17:46 cdanis@deploy2002: helmfile [codfw] START helmfile.d/services/chart-renderer: apply
  • 17:46 cdanis@deploy2002: helmfile [eqiad] DONE helmfile.d/services/chart-renderer: apply
  • 17:46 cdanis@deploy2002: helmfile [eqiad] START helmfile.d/services/chart-renderer: apply
  • 17:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P73105 and previous config saved to /var/cache/conftool/dbconfig/20250203-174336-marostegui.json
  • 17:39 cdanis@deploy2002: helmfile [staging] DONE helmfile.d/services/chart-renderer: apply
  • 17:39 cdanis@deploy2002: helmfile [staging] START helmfile.d/services/chart-renderer: apply
  • 17:37 marostegui@cumin1002: dbctl commit (dc=all): 'db1188 (re)pooling @ 100%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73104 and previous config saved to /var/cache/conftool/dbconfig/20250203-173748-root.json
  • 17:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P73103 and previous config saved to /var/cache/conftool/dbconfig/20250203-172829-marostegui.json
  • 17:22 marostegui@cumin1002: dbctl commit (dc=all): 'db1188 (re)pooling @ 75%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73102 and previous config saved to /var/cache/conftool/dbconfig/20250203-172243-root.json
  • 17:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1221 (T384592)', diff saved to https://phabricator.wikimedia.org/P73101 and previous config saved to /var/cache/conftool/dbconfig/20250203-171322-marostegui.json
  • 17:07 marostegui@cumin1002: dbctl commit (dc=all): 'db1188 (re)pooling @ 50%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73100 and previous config saved to /var/cache/conftool/dbconfig/20250203-170737-root.json
  • 16:58 elukey@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: sync
  • 16:57 elukey@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: sync
  • 16:52 marostegui@cumin1002: dbctl commit (dc=all): 'db1188 (re)pooling @ 25%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73099 and previous config saved to /var/cache/conftool/dbconfig/20250203-165232-root.json
  • 16:44 fceratto@cumin1002: START - Cookbook sre.mysql.clone of db1169.eqiad.wmnet onto db1251.eqiad.wmnet
  • 16:37 marostegui@cumin1002: dbctl commit (dc=all): 'db1188 (re)pooling @ 10%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73097 and previous config saved to /var/cache/conftool/dbconfig/20250203-163727-root.json
  • 16:37 fceratto@cumin1002: dbctl commit (dc=all): 'Add db1251.eqiad.wmnet T385141', diff saved to https://phabricator.wikimedia.org/P73096 and previous config saved to /var/cache/conftool/dbconfig/20250203-163722-fceratto.json
  • 15:41 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db1251.eqiad.wmnet with reason: provisioning - T385141
  • 15:40 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db1169.eqiad.wmnet with reason: provisioning - T385141
  • 15:37 fceratto@cumin1002: dbctl commit (dc=all): 'Depool db1169.eqiad.wmnet T385141', diff saved to https://phabricator.wikimedia.org/P73093 and previous config saved to /var/cache/conftool/dbconfig/20250203-153755-fceratto.json
  • 14:42 Lucas_WMDE: UTC afternoon backport+config window done
  • 14:33 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Enable VisualEditor EditCheck on dewiki (T385205) (duration: 10m 43s)
  • 14:26 root@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Index rebuild
  • 14:26 lucaswerkmeister-wmde@deploy2002: kemayo, lucaswerkmeister-wmde: Continuing with sync
  • 14:26 root@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db1188.eqiad.wmnet
  • 14:26 lucaswerkmeister-wmde@deploy2002: kemayo, lucaswerkmeister-wmde: Backport for Enable VisualEditor EditCheck on dewiki (T385205) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:22 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Enable VisualEditor EditCheck on dewiki (T385205)
  • 14:20 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for Change "$wgUploadMissingFileUrl" for svwiktionary (T383452) (duration: 14m 42s)
  • 14:19 root@cumin1002: START - Cookbook sre.mysql.upgrade for db1188.eqiad.wmnet
  • 14:19 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1188 T385084', diff saved to https://phabricator.wikimedia.org/P73091 and previous config saved to /var/cache/conftool/dbconfig/20250203-141939-marostegui.json
  • 14:13 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, dreamrimmer: Continuing with sync
  • 14:11 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, dreamrimmer: Backport for Change "$wgUploadMissingFileUrl" for svwiktionary (T383452) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:05 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for Change "$wgUploadMissingFileUrl" for svwiktionary (T383452)
  • 13:47 marostegui@cumin1002: dbctl commit (dc=all): 'db1182 (re)pooling @ 100%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73088 and previous config saved to /var/cache/conftool/dbconfig/20250203-134742-root.json
  • 13:43 reedy@deploy2002: Finished scap sync-world: Backport for Add missing array_values for PHP 7 compatibility (T385255), SpecialMathWikibase: Null-coalescence getDescription() call (T385170), SpecialMathWikibase: Null-coalescence $par (T385269), ApiQueryContentTranslationSuggestions: Set default value for to and from parameters (T385267) (duration
  • 13:34 reedy@deploy2002: reedy: Continuing with sync
  • 13:34 reedy@deploy2002: reedy: Backport for Add missing array_values for PHP 7 compatibility (T385255), SpecialMathWikibase: Null-coalescence getDescription() call (T385170), SpecialMathWikibase: Null-coalescence $par (T385269), ApiQueryContentTranslationSuggestions: Set default value for to and from parameters (T385267) synced to the testservers (h
  • 13:32 marostegui@cumin1002: dbctl commit (dc=all): 'db1182 (re)pooling @ 75%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73087 and previous config saved to /var/cache/conftool/dbconfig/20250203-133237-root.json
  • 13:28 root@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2209.codfw.wmnet with reason: Index rebuild
  • 13:27 root@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2209.codfw.wmnet
  • 13:27 reedy@deploy2002: Started scap sync-world: Backport for Add missing array_values for PHP 7 compatibility (T385255), SpecialMathWikibase: Null-coalescence getDescription() call (T385170), SpecialMathWikibase: Null-coalescence $par (T385269), ApiQueryContentTranslationSuggestions: Set default value for to and from parameters (T385267)
  • 13:23 marostegui@dns1006: END - running authdns-update
  • 13:23 root@cumin1002: START - Cookbook sre.mysql.upgrade for db2209.codfw.wmnet
  • 13:22 marostegui@dns1006: START - running authdns-update
  • 13:17 marostegui@cumin1002: dbctl commit (dc=all): 'db1182 (re)pooling @ 50%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73085 and previous config saved to /var/cache/conftool/dbconfig/20250203-131732-root.json
  • 13:17 cgoubert@deploy2002: Unlocked for deployment [MediaWiki]: Emergency s3 switchover T385457 (duration: 07m 36s)
  • 13:16 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2209 T385457', diff saved to https://phabricator.wikimedia.org/P73084 and previous config saved to /var/cache/conftool/dbconfig/20250203-131631-marostegui.json
  • 13:15 marostegui@cumin1002: dbctl commit (dc=all): 'Promote db2205 to s3 primary and set section read-write T385457', diff saved to https://phabricator.wikimedia.org/P73083 and previous config saved to /var/cache/conftool/dbconfig/20250203-131542-root.json
  • 13:15 jebe@deploy2002: Finished deploy [airflow-dags/analytics_product@ce1f0f6]: (no justification provided) (duration: 00m 36s)
  • 13:14 marostegui@cumin1002: dbctl commit (dc=all): 'Set s3 codfw as read-only for maintenance - T385457', diff saved to https://phabricator.wikimedia.org/P73082 and previous config saved to /var/cache/conftool/dbconfig/20250203-131452-root.json
  • 13:14 jebe@deploy2002: Started deploy [airflow-dags/analytics_product@ce1f0f6]: (no justification provided)
  • 13:09 cgoubert@deploy2002: Locking from deployment [MediaWiki]: Emergency s3 switchover T385457
  • 13:07 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
  • 13:07 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-cron: apply
  • 13:07 cgoubert@deploy2002: Stopping before sync operations
  • 13:06 marostegui: Emergency s3 switchover T385457
  • 13:02 marostegui@cumin1002: dbctl commit (dc=all): 'Set db2205 with weight 0 T385457', diff saved to https://phabricator.wikimedia.org/P73081 and previous config saved to /var/cache/conftool/dbconfig/20250203-130248-root.json
  • 13:02 marostegui@cumin1002: dbctl commit (dc=all): 'db1182 (re)pooling @ 25%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73080 and previous config saved to /var/cache/conftool/dbconfig/20250203-130226-root.json
  • 13:01 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 24 hosts with reason: Primary switchover s3
  • 12:55 cgoubert@deploy2002: Started scap sync-world: Rebuild image and release file for mw-cron
  • 12:53 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
  • 12:53 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-cron: apply
  • 12:53 cgoubert@deploy2002: Finished scap sync-world: Testing scap deployment of mw-cron (duration: 02m 46s)
  • 12:51 cgoubert@deploy2002: Started scap sync-world: Testing scap deployment of mw-cron
  • 12:47 marostegui@cumin1002: dbctl commit (dc=all): 'db1182 (re)pooling @ 10%: Repooling after rebuild index', diff saved to https://phabricator.wikimedia.org/P73079 and previous config saved to /var/cache/conftool/dbconfig/20250203-124721-root.json
  • 12:25 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
  • 12:25 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-cron: apply
  • 12:19 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
  • 12:19 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-cron: apply
  • 12:11 marostegui@cumin1002: dbctl commit (dc=all): 'es2027 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P73078 and previous config saved to /var/cache/conftool/dbconfig/20250203-121113-root.json
  • 11:56 marostegui@cumin1002: dbctl commit (dc=all): 'es2027 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P73077 and previous config saved to /var/cache/conftool/dbconfig/20250203-115608-root.json
  • 11:41 marostegui@cumin1002: dbctl commit (dc=all): 'es2027 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P73076 and previous config saved to /var/cache/conftool/dbconfig/20250203-114103-root.json
  • 11:28 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb1014.eqiad.wmnet with reason: Kernel reboot
  • 11:27 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb1018.eqiad.wmnet with reason: Kernel reboot
  • 11:25 marostegui@cumin1002: dbctl commit (dc=all): 'es2027 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P73075 and previous config saved to /var/cache/conftool/dbconfig/20250203-112558-root.json
  • 11:24 marostegui: Reboot and upgrade db1155
  • 11:23 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1155.eqiad.wmnet with reason: Kernel reboot
  • 11:10 marostegui@cumin1002: dbctl commit (dc=all): 'es2027 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P73074 and previous config saved to /var/cache/conftool/dbconfig/20250203-111052-root.json
  • 11:10 root@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for es2027.codfw.wmnet
  • 11:04 marostegui@dns1006: END - running authdns-update
  • 11:02 marostegui@dns1006: START - running authdns-update
  • 11:00 root@cumin1002: START - Cookbook sre.mysql.upgrade for es2027.codfw.wmnet
  • 10:59 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es2027 for kernel reboot', diff saved to https://phabricator.wikimedia.org/P73073 and previous config saved to /var/cache/conftool/dbconfig/20250203-105935-marostegui.json
  • 10:59 marostegui@cumin1002: dbctl commit (dc=all): 'Promote es2034 to es3 codfw master dbtmaint T376905', diff saved to https://phabricator.wikimedia.org/P73072 and previous config saved to /var/cache/conftool/dbconfig/20250203-105915-root.json
  • 10:36 marostegui@cumin1002: dbctl commit (dc=all): 'es2026 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P73071 and previous config saved to /var/cache/conftool/dbconfig/20250203-103649-root.json
  • 10:36 marostegui@cumin1002: dbctl commit (dc=all): 'es2037 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P73070 and previous config saved to /var/cache/conftool/dbconfig/20250203-103634-root.json
  • 10:21 marostegui@cumin1002: dbctl commit (dc=all): 'es2026 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P73069 and previous config saved to /var/cache/conftool/dbconfig/20250203-102144-root.json
  • 10:21 marostegui@cumin1002: dbctl commit (dc=all): 'es2037 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P73068 and previous config saved to /var/cache/conftool/dbconfig/20250203-102129-root.json
  • 10:06 marostegui@cumin1002: dbctl commit (dc=all): 'es2026 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P73067 and previous config saved to /var/cache/conftool/dbconfig/20250203-100638-root.json
  • 10:06 marostegui@cumin1002: dbctl commit (dc=all): 'es2037 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P73066 and previous config saved to /var/cache/conftool/dbconfig/20250203-100623-root.json
  • 10:03 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1221 (T384592)', diff saved to https://phabricator.wikimedia.org/P73065 and previous config saved to /var/cache/conftool/dbconfig/20250203-100300-marostegui.json
  • 10:02 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1015,1019].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 10:02 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1221.eqiad.wmnet with reason: Maintenance
  • 10:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T384592)', diff saved to https://phabricator.wikimedia.org/P73064 and previous config saved to /var/cache/conftool/dbconfig/20250203-100221-marostegui.json
  • 09:51 marostegui@cumin1002: dbctl commit (dc=all): 'es2026 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P73063 and previous config saved to /var/cache/conftool/dbconfig/20250203-095133-root.json
  • 09:51 marostegui@cumin1002: dbctl commit (dc=all): 'es2037 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P73062 and previous config saved to /var/cache/conftool/dbconfig/20250203-095118-root.json
  • 09:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P73061 and previous config saved to /var/cache/conftool/dbconfig/20250203-094714-marostegui.json
  • 09:36 marostegui@cumin1002: dbctl commit (dc=all): 'es2026 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P73060 and previous config saved to /var/cache/conftool/dbconfig/20250203-093628-root.json
  • 09:36 marostegui@cumin1002: dbctl commit (dc=all): 'es2037 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P73059 and previous config saved to /var/cache/conftool/dbconfig/20250203-093613-root.json
  • 09:36 root@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for es2026.codfw.wmnet
  • 09:35 root@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for es2037.codfw.wmnet
  • 09:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P73058 and previous config saved to /var/cache/conftool/dbconfig/20250203-093207-marostegui.json
  • 09:14 marostegui@cumin1002: dbctl commit (dc=all): 'Promote es2031 to es2 codfw master dbtmaint T376905', diff saved to https://phabricator.wikimedia.org/P73053 and previous config saved to /var/cache/conftool/dbconfig/20250203-091450-root.json
  • 09:13 root@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Index rebuild
  • 09:12 root@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db1182.eqiad.wmnet
  • 09:07 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2187.codfw.wmnet with reason: Index rebuild + upgrade
  • 09:06 root@cumin1002: START - Cookbook sre.mysql.upgrade for db1182.eqiad.wmnet
  • 09:06 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1182 T385084', diff saved to https://phabricator.wikimedia.org/P73052 and previous config saved to /var/cache/conftool/dbconfig/20250203-090558-marostegui.json
  • 08:37 jelto@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikidata-query-gui: apply
  • 08:37 jelto@deploy2002: helmfile [codfw] START helmfile.d/services/wikidata-query-gui: apply
  • 08:37 jelto@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikidata-query-gui: apply
  • 08:36 jelto@deploy2002: helmfile [eqiad] START helmfile.d/services/wikidata-query-gui: apply
  • 08:35 jelto@deploy2002: helmfile [staging] DONE helmfile.d/services/wikidata-query-gui: apply
  • 08:34 jelto@deploy2002: helmfile [staging] START helmfile.d/services/wikidata-query-gui: apply
  • 08:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics: apply
  • 08:31 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics: apply
  • 08:30 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 08:29 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 02:54 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1199 (T384592)', diff saved to https://phabricator.wikimedia.org/P73051 and previous config saved to /var/cache/conftool/dbconfig/20250203-025443-marostegui.json
  • 02:54 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1199.eqiad.wmnet with reason: Maintenance
  • 02:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T384592)', diff saved to https://phabricator.wikimedia.org/P73050 and previous config saved to /var/cache/conftool/dbconfig/20250203-025421-marostegui.json
  • 02:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P73049 and previous config saved to /var/cache/conftool/dbconfig/20250203-023914-marostegui.json
  • 02:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P73048 and previous config saved to /var/cache/conftool/dbconfig/20250203-022407-marostegui.json
  • 02:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T384592)', diff saved to https://phabricator.wikimedia.org/P73047 and previous config saved to /var/cache/conftool/dbconfig/20250203-020900-marostegui.json

2025-02-02

  • 20:07 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1190 (T384592)', diff saved to https://phabricator.wikimedia.org/P73046 and previous config saved to /var/cache/conftool/dbconfig/20250202-200724-marostegui.json
  • 20:07 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1190.eqiad.wmnet with reason: Maintenance
  • 15:51 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 14:47 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7:00:00 on db2212.codfw.wmnet with reason: Maintenance
  • 11:45 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7:00:00 on db1184.eqiad.wmnet with reason: Maintenance
  • 08:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2216 (T384592)', diff saved to https://phabricator.wikimedia.org/P73045 and previous config saved to /var/cache/conftool/dbconfig/20250202-085551-marostegui.json
  • 08:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P73044 and previous config saved to /var/cache/conftool/dbconfig/20250202-084044-marostegui.json
  • 08:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P73043 and previous config saved to /var/cache/conftool/dbconfig/20250202-082537-marostegui.json
  • 08:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2216 (T384592)', diff saved to https://phabricator.wikimedia.org/P73042 and previous config saved to /var/cache/conftool/dbconfig/20250202-081030-marostegui.json
  • 07:11 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2216 (T384592)', diff saved to https://phabricator.wikimedia.org/P73041 and previous config saved to /var/cache/conftool/dbconfig/20250202-071137-marostegui.json
  • 07:11 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7:00:00 on db2216.codfw.wmnet with reason: Maintenance
  • 07:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2203 (T384592)', diff saved to https://phabricator.wikimedia.org/P73040 and previous config saved to /var/cache/conftool/dbconfig/20250202-071115-marostegui.json
  • 06:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2203', diff saved to https://phabricator.wikimedia.org/P73039 and previous config saved to /var/cache/conftool/dbconfig/20250202-065608-marostegui.json
  • 06:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2203', diff saved to https://phabricator.wikimedia.org/P73038 and previous config saved to /var/cache/conftool/dbconfig/20250202-064101-marostegui.json
  • 06:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2203 (T384592)', diff saved to https://phabricator.wikimedia.org/P73037 and previous config saved to /var/cache/conftool/dbconfig/20250202-062554-marostegui.json
  • 05:27 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2203 (T384592)', diff saved to https://phabricator.wikimedia.org/P73036 and previous config saved to /var/cache/conftool/dbconfig/20250202-052741-marostegui.json
  • 05:27 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7:00:00 on db2203.codfw.wmnet with reason: Maintenance
  • 04:37 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7:00:00 on db2202.codfw.wmnet with reason: Maintenance
  • 04:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2188 (T384592)', diff saved to https://phabricator.wikimedia.org/P73035 and previous config saved to /var/cache/conftool/dbconfig/20250202-043646-marostegui.json
  • 04:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P73034 and previous config saved to /var/cache/conftool/dbconfig/20250202-042139-marostegui.json
  • 04:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P73033 and previous config saved to /var/cache/conftool/dbconfig/20250202-040632-marostegui.json
  • 03:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2188 (T384592)', diff saved to https://phabricator.wikimedia.org/P73032 and previous config saved to /var/cache/conftool/dbconfig/20250202-035125-marostegui.json
  • 02:52 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2188 (T384592)', diff saved to https://phabricator.wikimedia.org/P73031 and previous config saved to /var/cache/conftool/dbconfig/20250202-025237-marostegui.json
  • 02:52 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7:00:00 on db2188.codfw.wmnet with reason: Maintenance
  • 02:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T384592)', diff saved to https://phabricator.wikimedia.org/P73030 and previous config saved to /var/cache/conftool/dbconfig/20250202-025215-marostegui.json
  • 02:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P73029 and previous config saved to /var/cache/conftool/dbconfig/20250202-023708-marostegui.json
  • 02:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P73028 and previous config saved to /var/cache/conftool/dbconfig/20250202-022201-marostegui.json
  • 02:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T384592)', diff saved to https://phabricator.wikimedia.org/P73027 and previous config saved to /var/cache/conftool/dbconfig/20250202-020654-marostegui.json
  • 00:53 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2176 (T384592)', diff saved to https://phabricator.wikimedia.org/P73026 and previous config saved to /var/cache/conftool/dbconfig/20250202-005259-marostegui.json
  • 00:52 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7:00:00 on db2176.codfw.wmnet with reason: Maintenance
  • 00:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T384592)', diff saved to https://phabricator.wikimedia.org/P73025 and previous config saved to /var/cache/conftool/dbconfig/20250202-005236-marostegui.json
  • 00:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P73024 and previous config saved to /var/cache/conftool/dbconfig/20250202-003730-marostegui.json
  • 00:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P73023 and previous config saved to /var/cache/conftool/dbconfig/20250202-002223-marostegui.json
  • 00:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T384592)', diff saved to https://phabricator.wikimedia.org/P73022 and previous config saved to /var/cache/conftool/dbconfig/20250202-000716-marostegui.json

2025-02-01

  • 22:55 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2174 (T384592)', diff saved to https://phabricator.wikimedia.org/P73021 and previous config saved to /var/cache/conftool/dbconfig/20250201-225519-marostegui.json
  • 22:55 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7:00:00 on db2174.codfw.wmnet with reason: Maintenance
  • 22:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2173 (T384592)', diff saved to https://phabricator.wikimedia.org/P73020 and previous config saved to /var/cache/conftool/dbconfig/20250201-225456-marostegui.json
  • 22:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P73019 and previous config saved to /var/cache/conftool/dbconfig/20250201-223949-marostegui.json
  • 22:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P73018 and previous config saved to /var/cache/conftool/dbconfig/20250201-222442-marostegui.json
  • 20:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2173 (T384592)', diff saved to https://phabricator.wikimedia.org/P73016 and previous config saved to /var/cache/conftool/dbconfig/20250201-205602-marostegui.json
  • 20:55 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 20:55 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7:00:00 on db2173.codfw.wmnet with reason: Maintenance
  • 20:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170 (T384592)', diff saved to https://phabricator.wikimedia.org/P73015 and previous config saved to /var/cache/conftool/dbconfig/20250201-205525-marostegui.json
  • 20:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P73014 and previous config saved to /var/cache/conftool/dbconfig/20250201-204018-marostegui.json
  • 20:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P73013 and previous config saved to /var/cache/conftool/dbconfig/20250201-202511-marostegui.json
  • 20:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170 (T384592)', diff saved to https://phabricator.wikimedia.org/P73012 and previous config saved to /var/cache/conftool/dbconfig/20250201-201004-marostegui.json
  • 19:05 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2170 (T384592)', diff saved to https://phabricator.wikimedia.org/P73011 and previous config saved to /var/cache/conftool/dbconfig/20250201-190526-marostegui.json
  • 19:05 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7:00:00 on db2170.codfw.wmnet with reason: Maintenance
  • 19:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T384592)', diff saved to https://phabricator.wikimedia.org/P73010 and previous config saved to /var/cache/conftool/dbconfig/20250201-190504-marostegui.json
  • 18:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P73009 and previous config saved to /var/cache/conftool/dbconfig/20250201-184957-marostegui.json
  • 18:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P73008 and previous config saved to /var/cache/conftool/dbconfig/20250201-183450-marostegui.json
  • 18:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T384592)', diff saved to https://phabricator.wikimedia.org/P73007 and previous config saved to /var/cache/conftool/dbconfig/20250201-181943-marostegui.json
  • 17:06 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2153 (T384592)', diff saved to https://phabricator.wikimedia.org/P73006 and previous config saved to /var/cache/conftool/dbconfig/20250201-170624-marostegui.json
  • 17:06 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7:00:00 on db2153.codfw.wmnet with reason: Maintenance
  • 17:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T384592)', diff saved to https://phabricator.wikimedia.org/P73005 and previous config saved to /var/cache/conftool/dbconfig/20250201-170602-marostegui.json
  • 16:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P73004 and previous config saved to /var/cache/conftool/dbconfig/20250201-165055-marostegui.json
  • 16:41 cmooney@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on cr2-magru with reason: IBGP instability from cr1 to cr2 in magru causing ping failures from alert1002
  • 16:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P73003 and previous config saved to /var/cache/conftool/dbconfig/20250201-163548-marostegui.json
  • 16:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T384592)', diff saved to https://phabricator.wikimedia.org/P73002 and previous config saved to /var/cache/conftool/dbconfig/20250201-162041-marostegui.json
  • 15:17 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2146 (T384592)', diff saved to https://phabricator.wikimedia.org/P73001 and previous config saved to /var/cache/conftool/dbconfig/20250201-151709-marostegui.json
  • 15:17 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7:00:00 on db2146.codfw.wmnet with reason: Maintenance
  • 15:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T384592)', diff saved to https://phabricator.wikimedia.org/P73000 and previous config saved to /var/cache/conftool/dbconfig/20250201-151646-marostegui.json
  • 15:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P72999 and previous config saved to /var/cache/conftool/dbconfig/20250201-150139-marostegui.json
  • 14:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P72998 and previous config saved to /var/cache/conftool/dbconfig/20250201-144632-marostegui.json
  • 14:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T384592)', diff saved to https://phabricator.wikimedia.org/P72997 and previous config saved to /var/cache/conftool/dbconfig/20250201-143125-marostegui.json
  • 13:19 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2145 (T384592)', diff saved to https://phabricator.wikimedia.org/P72996 and previous config saved to /var/cache/conftool/dbconfig/20250201-131925-marostegui.json
  • 13:19 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7:00:00 on db2145.codfw.wmnet with reason: Maintenance
  • 12:22 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7:00:00 on db2141.codfw.wmnet with reason: Maintenance
  • 11:18 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 10:22 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7:00:00 on db1240.eqiad.wmnet with reason: Maintenance
  • 09:24 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7:00:00 on db1239.eqiad.wmnet with reason: Maintenance
  • 09:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1235 (T384592)', diff saved to https://phabricator.wikimedia.org/P72995 and previous config saved to /var/cache/conftool/dbconfig/20250201-092349-marostegui.json
  • 09:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P72994 and previous config saved to /var/cache/conftool/dbconfig/20250201-090842-marostegui.json
  • 08:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P72993 and previous config saved to /var/cache/conftool/dbconfig/20250201-085335-marostegui.json
  • 08:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1235 (T384592)', diff saved to https://phabricator.wikimedia.org/P72992 and previous config saved to /var/cache/conftool/dbconfig/20250201-083827-marostegui.json
  • 07:31 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1235 (T384592)', diff saved to https://phabricator.wikimedia.org/P72991 and previous config saved to /var/cache/conftool/dbconfig/20250201-073139-marostegui.json
  • 07:31 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7:00:00 on db1235.eqiad.wmnet with reason: Maintenance
  • 07:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234 (T384592)', diff saved to https://phabricator.wikimedia.org/P72990 and previous config saved to /var/cache/conftool/dbconfig/20250201-073116-marostegui.json
  • 07:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P72989 and previous config saved to /var/cache/conftool/dbconfig/20250201-071609-marostegui.json
  • 07:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P72988 and previous config saved to /var/cache/conftool/dbconfig/20250201-070103-marostegui.json
  • 06:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234 (T384592)', diff saved to https://phabricator.wikimedia.org/P72987 and previous config saved to /var/cache/conftool/dbconfig/20250201-064555-marostegui.json
  • 05:30 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1234 (T384592)', diff saved to https://phabricator.wikimedia.org/P72986 and previous config saved to /var/cache/conftool/dbconfig/20250201-053027-marostegui.json
  • 05:30 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7:00:00 on db1234.eqiad.wmnet with reason: Maintenance
  • 05:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232 (T384592)', diff saved to https://phabricator.wikimedia.org/P72985 and previous config saved to /var/cache/conftool/dbconfig/20250201-053005-marostegui.json
  • 05:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P72984 and previous config saved to /var/cache/conftool/dbconfig/20250201-051458-marostegui.json
  • 04:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P72983 and previous config saved to /var/cache/conftool/dbconfig/20250201-045951-marostegui.json
  • 04:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232 (T384592)', diff saved to https://phabricator.wikimedia.org/P72982 and previous config saved to /var/cache/conftool/dbconfig/20250201-044444-marostegui.json
  • 03:34 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1232 (T384592)', diff saved to https://phabricator.wikimedia.org/P72981 and previous config saved to /var/cache/conftool/dbconfig/20250201-033412-marostegui.json
  • 03:34 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7:00:00 on db1232.eqiad.wmnet with reason: Maintenance
  • 03:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219 (T384592)', diff saved to https://phabricator.wikimedia.org/P72980 and previous config saved to /var/cache/conftool/dbconfig/20250201-033350-marostegui.json
  • 03:18 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P72979 and previous config saved to /var/cache/conftool/dbconfig/20250201-031843-marostegui.json
  • 03:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P72978 and previous config saved to /var/cache/conftool/dbconfig/20250201-030337-marostegui.json
  • 02:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219 (T384592)', diff saved to https://phabricator.wikimedia.org/P72977 and previous config saved to /var/cache/conftool/dbconfig/20250201-024829-marostegui.json
  • 01:37 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1219 (T384592)', diff saved to https://phabricator.wikimedia.org/P72976 and previous config saved to /var/cache/conftool/dbconfig/20250201-013748-marostegui.json
  • 01:37 marostegui@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7:00:00 on db1219.eqiad.wmnet with reason: Maintenance
  • 01:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T384592)', diff saved to https://phabricator.wikimedia.org/P72975 and previous config saved to /var/cache/conftool/dbconfig/20250201-013726-marostegui.json
  • 01:25 brett: import ncmonitor 1.3.1 into bookworm-wikimedia
  • 01:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P72974 and previous config saved to /var/cache/conftool/dbconfig/20250201-012219-marostegui.json
  • 01:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P72973 and previous config saved to /var/cache/conftool/dbconfig/20250201-010712-marostegui.json
  • 00:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T384592)', diff saved to https://phabricator.wikimedia.org/P72971 and previous config saved to /var/cache/conftool/dbconfig/20250201-005205-marostegui.json

Archives

See Server Admin Log/Archives.